I suppose it is, but only if we want to group by a `Series` with the same dtype as the first `Series`. Would it be possible to have another `TypeVar` similar to `S1`, but not correlated with it? Would I need to create this new `S2` `TypeVar` somewhere?
If you want `Series[S1].groupby()` to allow a `Series[S2]` as an argument, where `S1` and `S2` really cover the same types, and you want the result to be `SeriesGroupBy[S1, S2]`, then you would need to duplicate the `TypeVar` `S1` and define `S2`. We had to do that with `HashableT#`.
Note: I'm pretty sure this is what you'd need to do to get it to work, but not 100% sure.
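To make the suggestion above concrete, here is a minimal, self-contained sketch of the idea of duplicating a `TypeVar`. It does not use pandas or pandas-stubs: `ToySeries` and `ToyGroupBy` are hypothetical stand-ins invented for illustration, and the two `TypeVar`s are left unconstrained for brevity. The point is only that two independent type variables let a checker infer `ToyGroupBy[int, bool]` when an `int` series is grouped by a `bool` series.

```python
from __future__ import annotations

from typing import Generic, Iterator, TypeVar

# Two independent TypeVars: a checker may bind them to different types
# within a single call. (Assumption: unconstrained here for brevity.)
S1 = TypeVar("S1")
S2 = TypeVar("S2")


class ToySeries(Generic[S1]):
    """Hypothetical stand-in for Series[S1]; not the pandas class."""

    def __init__(self, values: list[S1]) -> None:
        self.values = values

    def groupby(self, by: ToySeries[S2]) -> ToyGroupBy[S1, S2]:
        # The key type S2 comes from the grouper, not from self.
        return ToyGroupBy(self, by)


class ToyGroupBy(Generic[S1, S2]):
    """Hypothetical stand-in for SeriesGroupBy[S1, ByT]."""

    def __init__(self, obj: ToySeries[S1], by: ToySeries[S2]) -> None:
        self.obj = obj
        self.by = by

    def __iter__(self) -> Iterator[tuple[S2, ToySeries[S1]]]:
        # Collect values per key, keeping first-seen key order.
        groups: dict[S2, list[S1]] = {}
        for key, value in zip(self.by.values, self.obj.values):
            groups.setdefault(key, []).append(value)
        for key, values in groups.items():
            yield key, ToySeries(values)


data = ToySeries([1, 2, 3, 4])
mask = ToySeries([False, True, False, True])
# A type checker should see list[tuple[bool, ToySeries[int]]] here.
result = list(data.groupby(mask))
```

If `groupby` reused `S1` for the grouper instead of `S2`, the checker would tie the key type to the element type of `data`, which is the correlation the question above is trying to avoid.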
Describe the bug
When using a `Series` in a `groupby`, the `byT` type is supposed to be `tuple` because `Series` is included in the `TypeVar` `GroupByObjectNonScalar`, but since there is only one `Series`, it should not be a tuple, but rather the type of the `Series` we use in the groupby.

See how `GroupByObjectNonScalar` is declared here: pandas-stubs/pandas-stubs/_typing.pyi, line 435 in 70c412c

Both `Series` and `list[Series]` are included; the same probably goes for `Function` or `Grouper`.

See how the resulting `groupby` is decided to be `SeriesGroupBy[S1, tuple]` here when `by` is either a `MultiIndex` or a `GroupByObjectNonScalar`: pandas-stubs/pandas-stubs/core/series.pyi, line 627 in 70c412c
It seems to me that `GroupByObjectNonScalar` is a bit too broad: it includes both objects representing a single grouper and objects representing multiple groupers (for which `SeriesGroupBy[S1, tuple]` is consistent). So maybe we should split this `TypeVar` in two and change the `groupby` overloads so that when only one grouper is present, the type is `Any`?

`_typing.pyi`
`core/series.pyi`

Note that I did not include the `np.ndarray` object in `MonoGroupByObjectNonScalar` since it can be 2D. I am not sure how this one should be treated.

Of course, this proposal is just an illustration. I suppose such a split would call for many other modifications elsewhere, especially within core/frame.pyi, and there might be a better solution.
To Reproduce
The following example shows that the type of output is deduced to be `list[tuple[tuple, Series]]` while it should be `list[tuple[bool, Series]]`.

Output when running with python:
Output when running with pyright:

This happens because the `list` function calls the `__iter__` method, which is typed here: pandas-stubs/pandas-stubs/core/groupby/generic.pyi, line 143 in 70c412c
Please complete the following information:
`pandas-stubs`: 2.0.1.230501