Correct return type of sum() builtin #1582

teepark · 2017-08-29T20:27:24Z

sum([]) always returns the integer 0.

JelleZijlstra · 2017-09-23T12:59:56Z

Not sure about this; we tend to avoid union return types.

teepark · 2017-09-25T20:32:21Z

Some context on why I'm changing this: I had code that would sum a list of decimal.Decimal, and then I attempted to .quantize() the returned value, which resulted in a production bug when the list was empty for the first time.

mypy is in a position to know that the return value could possibly be an integer. Can you clarify why union return types are avoided? It's certainly more accurate in this case.
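For illustration, the failure described above can be reproduced in a few lines. This is a minimal sketch, not the original production code: the `total` helper and the two-decimal precision are assumptions made for the example.

```python
from decimal import Decimal


def total(amounts):
    """Hypothetical helper: sum Decimal amounts and round to two places."""
    # sum() starts from the int 0, so an empty iterable yields an int,
    # and int has no .quantize() method.
    return sum(amounts).quantize(Decimal("0.01"))


# Non-empty input: the result is a Decimal, so .quantize() works.
print(type(total([Decimal("1.25"), Decimal("2.50")])).__name__)

# Empty input: sum() returns the int 0 and .quantize() raises.
print(type(sum([])).__name__)
try:
    total([])
except AttributeError as exc:
    print("AttributeError:", exc)
```

The bug only appears the first time the list is empty, which is why it can survive testing on typical data.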

JukkaL · 2017-09-26T09:06:08Z

One problem with union in this context is that some code may be careful to never pass an empty iterable to sum() and thus it will never return an int. In this case a union return type could generate false positives. I don't have a strong opinion either way, though.

teepark · 2017-09-26T14:24:23Z

Thanks for the response. That makes sense and I imagine this change would be more painful for people being careful to never pass an empty list.

But ultimately that's a dependency only on runtime data, which mypy isn't able to take into consideration. It's not really any different from a function that returns different types in the branches of an if statement -- mypy can only accurately model it as a union, even though code using the function might have more detailed knowledge based on how it crafted the arguments.
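As a rough sketch of the comparison made above, here is a function whose return type depends on a runtime branch. The `parse_flag` name and its behavior are hypothetical, chosen only to illustrate the point.

```python
from typing import Union


def parse_flag(raw: str) -> Union[int, str]:
    """Hypothetical helper: return an int for digit strings, else the text."""
    if raw.isdigit():
        return int(raw)
    return raw


value = parse_flag("42")
# The caller may know the argument was all digits, but a type checker
# only sees Union[int, str], so the result has to be narrowed before
# int-only operations are accepted.
if isinstance(value, int):
    print(value + 1)
```

The caller's extra knowledge about the argument is not visible in the signature, which is the same situation as a caller of sum() who knows the iterable is non-empty.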

gvanrossum · 2017-09-26T15:59:07Z

It's similar to pow(), where usually pow(x, y) with int arguments is expected to return an int, but if y is negative it will return a float. We solved that with a plugin because usually y is a compile-time constant expression. But before it just returned Any.
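The runtime behavior of pow() referred to above can be checked directly. This short sketch only shows the builtin's behavior and says nothing about how the mypy plugin is implemented.

```python
# With int arguments and a non-negative exponent, pow() returns an int.
print(type(pow(2, 3)).__name__)   # int

# With a negative exponent the result is a float, even for int arguments.
print(type(pow(2, -1)).__name__)  # float
```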

Maybe Any is a better return type for sum()? I'm not sure -- it seems there's no good answer here, and a plugin won't help because a plugin can only tell whether the list is empty at compile time, which is no use for a typical sum() call.

One obvious improvement is to create two overloads, sum(x) and sum(x, y). The latter does not return an int, it returns y.
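A small sketch of the two-argument form mentioned above, using Decimal as an assumed element type: when a start value is passed, an empty iterable yields that start value instead of the int 0.

```python
from decimal import Decimal

zero = Decimal("0")

# One-argument form: an empty iterable gives the int 0.
print(type(sum([])).__name__)        # int

# Two-argument form: an empty iterable gives the start value back.
print(type(sum([], zero)).__name__)  # Decimal

# Decimal methods are therefore available even for empty input.
print(sum([], zero).quantize(Decimal("0.01")))
```

Passing an explicit start value of the element type is also a way for callers to sidestep the empty-iterable case at runtime.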

matthiaskramm · 2017-09-27T15:09:39Z

For pytype, we would write

@overload
def sum(iterable: Iterable[nothing]) -> int: ...
@overload
def sum(iterable: Iterable[_T], start: _T = ...) -> _T: ...

(See e.g. the pytype definition of "reduce", https://github.com/google/pytype/blob/master/pytype/pytd/builtins/__builtin__.pytd#L92)

However, neither mypy nor PEP 484 has a concept of "nothing", so this doesn't work here.
(It also doesn't work because mypy doesn't stop after the first matching @overload.)

gvanrossum · 2017-09-27T15:21:45Z

But how would you know whether a particular variable containing a list of integers can be empty or not? Otherwise this won't be very useful -- unlike pow(), sum() is rarely called with an empty list literal -- it's called with a variable that may or may not be guaranteed to be non-empty.

teepark · 2017-09-27T17:57:07Z

@gvanrossum I agree the overload is a definite improvement.

As a starting point, here's what I believe we'd need to accurately model all behavior of sum():

@overload
def mysum(iterable: Iterable[_T]) -> Union[_T, int]: ...

@overload
def mysum(iterable: Iterable[_T], start: _S) -> Union[_T, _S]: ...

In the two argument version we need a second type variable and the union because there's nothing outright requiring that the second argument be the same type as the items in the iterable, and promotion can happen either way:

sum([1, 2, 3], 4.5)  # float (start)
sum([1.0, 2.0, 3.5], 4)  # float (iterable)
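To make the two overloads and the promotion examples above concrete, here is a runnable sketch. The `mysum` wrapper simply delegates to the builtin; the implementation body is an assumption added so the stubs can be exercised, and it is not the typeshed definition.

```python
from typing import Iterable, TypeVar, Union, overload

_T = TypeVar("_T")
_S = TypeVar("_S")


@overload
def mysum(iterable: Iterable[_T]) -> Union[_T, int]: ...
@overload
def mysum(iterable: Iterable[_T], start: _S) -> Union[_T, _S]: ...


def mysum(iterable, start=0):
    # Illustrative implementation only: delegate to the builtin sum().
    return sum(iterable, start)


print(type(mysum([1, 2, 3], 4.5)).__name__)      # float, promoted by start
print(type(mysum([1.0, 2.0, 3.5], 4)).__name__)  # float, promoted by the items
print(type(mysum([])).__name__)                  # int, from the default start
```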

Do we want to model all that behavior? I would have found it helpful for mypy to notify me of the problem with my blind sum(my_decimals).quantize(precision), but if we want to optimize for the (probably?) common case where the iterable can't possibly be empty that makes sense too.

gvanrossum · 2017-09-27T18:08:34Z

Yes, this sounds right. We do this for e.g. dict.get() as well.

Correct return type of sum() builtin

e33390b

`sum([])` always returns the integer 0.

teepark force-pushed the sum-empty-int branch from 421e151 to e33390b on September 27, 2017 18:13

JelleZijlstra approved these changes Oct 5, 2017

JelleZijlstra merged commit 355f30c into python:master Oct 5, 2017

teepark deleted the sum-empty-int branch October 5, 2017 16:53
