fillna() does not work when value parameter is a list

Should raise on a passed list to `value`

The fillna() method gives inconsistent results when the value parameter is a list.

For example, using a simple DataFrame:

> import numpy
> import pandas
> df = pandas.DataFrame({'A': [numpy.nan, 1, 2], 'B': [10, numpy.nan, 12], 'C': [[20, 21, 22], [23, 24, 25], numpy.nan]})
> df
>     A   B             C
> 0 NaN  10  [20, 21, 22]
> 1   1 NaN  [23, 24, 25]
> 2   2  12           NaN
> 
> df.fillna(value=[100, 101, 102])
>      A    B             C
> 0  100   10  [20, 21, 22]
> 1    1  101  [23, 24, 25]
> 2    2   12           102

So it appears the values in the list are used to fill the 'holes' in order, if the list has the same length as the number of holes. But if the list is shorter than the number of holes, the behavior changes to using only the first value in the list:

> df.fillna(value=[100, 101])
>      A    B             C
> 0  100   10  [20, 21, 22]
> 1    1  100  [23, 24, 25]
> 2    2   12           100

If the list is longer than the number of holes, the result is even odder: the first value fills the first two holes and the third value fills the last one:

> df.fillna(value=[100, 101, 102, 103])
>      A    B             C
> 0  100   10  [20, 21, 22]
> 1    1  100  [23, 24, 25]
> 2    2   12           102

If you provide a dict that specifies the fill values by column, the values from the list are used within that column only:

> df.fillna(value={'C': [100, 101]})
>     A   B             C
> 0 NaN  10  [20, 21, 22]
> 1   1 NaN  [23, 24, 25]
> 2   2  12           100

Since it's not always practical to know the number of NaN values a priori, or to customize the length of the value list to match it, this is problematic. Furthermore, some desired values get over-interpreted and cannot be used.

For example, if I want to replace all NaN instances in a single column with the same list (either empty or non-empty), I can't figure out how to do it:

> df.fillna(value={'C': [[100,101]]})
>     A   B             C
> 0 NaN  10  [20, 21, 22]
> 1   1 NaN  [23, 24, 25]
> 2   2  12           100

Indeed, if you specify the empty list, nothing is filled:

> df.fillna(value={'C': list()})
>     A   B             C
> 0 NaN  10  [20, 21, 22]
> 1   1 NaN  [23, 24, 25]
> 2   2  12           NaN

But a dict works fine:

> df.fillna(value={'C': {0: 1}})
>     A   B             C
> 0 NaN  10  [20, 21, 22]
> 1   1 NaN  [23, 24, 25]
> 2   2  12        {0: 1}
> 
> df.fillna(value={'C': dict()})
>     A   B             C
> 0 NaN  10  [20, 21, 22]
> 1   1 NaN  [23, 24, 25]
> 2   2  12            {}
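
For the list case described earlier, one possible workaround is to bypass fillna() and rebuild the column by hand. This is only a sketch: `fill_cells_with_object` is a hypothetical helper, not part of pandas, and it assumes that missing cells are float NaN and that each filled cell should get its own copy of the list.

```python
import math

import numpy as np
import pandas as pd


def fill_cells_with_object(series, fill):
    """Return a new Series whose NaN cells each hold a copy of the list `fill`."""
    def is_missing(cell):
        # Only a float NaN counts as missing here; lists and dicts are kept.
        return isinstance(cell, float) and math.isnan(cell)

    cells = [list(fill) if is_missing(cell) else cell for cell in series]
    # dtype=object keeps every list as a single cell value.
    return pd.Series(cells, index=series.index, dtype=object)


df = pd.DataFrame({
    'A': [np.nan, 1, 2],
    'B': [10, np.nan, 12],
    'C': [[20, 21, 22], [23, 24, 25], np.nan],
})
df['C'] = fill_cells_with_object(df['C'], [100, 101])
```

Passing `[]` as `fill` puts an empty list in each missing cell of the column.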

So it appears that fillna() is making a lot of decisions about how the fill values should be applied, and certain desired outcomes can't be achieved because it's being too 'clever'.
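
The "should raise on a passed list" suggestion at the top could look roughly like the check below. This is a minimal sketch of the idea, not pandas code: `check_fill_value` is a hypothetical name, and the error message wording is an assumption.

```python
def check_fill_value(value):
    """Hypothetical validator: accept scalars and dicts, reject lists and tuples."""
    if isinstance(value, (list, tuple)):
        raise TypeError(
            '"value" must be a scalar or dict, '
            f'but a {type(value).__name__} was passed'
        )
    return value
```

With a check like this, `check_fill_value([100, 101])` fails immediately instead of filling the holes in a length-dependent way, while scalars and per-column dicts pass through unchanged.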


fillna() does not work when value parameter is a list #3435
