
Rank function #1731

Closed
0x0L opened this issue Nov 20, 2017 · 4 comments

Comments

@0x0L
Contributor

0x0L commented Nov 20, 2017

Hi,

I think xarray is missing a rank function.
Is there any reason not to expose a wrapper to bottleneck.nanrankdata?

See also #1635

[edit]
Although a moving rank is mentioned in the what's-new for v0.9.2, I wasn't able to find that functionality, or any trace of it, in the code :)

@jhamman
Member

jhamman commented Nov 21, 2017

> Is there any reason not to expose a wrapper to bottleneck.nanrankdata

@0x0L - I don't think so and I think we'd be open to adding this function. Even better if there is a fallback numpy equivalent but I don't think that would be required.

I looked at the (my) whatsnew note from 0.9.2, and it seems we decided to remove this option until there is a rank method for DataArray/Dataset objects. See @shoyer's comment: #1278 (comment)

@max-sixty
Collaborator

Great idea.

We use bottleneck.nanrankdata manually; I don't think there's a vectorized numpy fallback.
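
For readers unfamiliar with the function under discussion, here is a minimal pure-Python sketch of the ranking semantics usually associated with nanrankdata: tied values share their average rank, and NaN inputs stay NaN. The name `nanrank` is hypothetical, the loop is neither vectorized nor bottleneck's actual implementation, and it only handles a 1-D sequence of floats.

```python
import math

def nanrank(values):
    """Average-rank the non-NaN entries of a 1-D sequence; NaN stays NaN."""
    # Indices of the valid (non-NaN) entries, sorted by their value.
    order = sorted((i for i, v in enumerate(values) if not math.isnan(v)),
                   key=lambda i: values[i])
    ranks = [math.nan] * len(values)
    start = 0
    while start < len(order):
        # Extend the group [start, stop) over entries that tie with each other.
        stop = start + 1
        while stop < len(order) and values[order[stop]] == values[order[start]]:
            stop += 1
        # The 1-based positions start+1 .. stop all receive their mean.
        mean_rank = (start + 1 + stop) / 2
        for i in order[start:stop]:
            ranks[i] = mean_rank
        start = stop
    return ranks

print(nanrank([3.0, math.nan, 1.0, 3.0]))  # [2.5, nan, 1.0, 2.5]
```

The two 3.0 entries occupy sorted positions 2 and 3, so both receive 2.5, while the NaN is passed through unranked.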

@0x0L
Contributor Author

0x0L commented Nov 21, 2017

A few points:

  • nanrankdata only supports numeric types; how about wrapping rankdata with a keyword option?
  • There's also a push method, similar to pandas ffill, which would be nice.
  • move_rank outputs are not normalized the same way as rankdata outputs, which might be a bit confusing.
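
To make the last point concrete: rankdata returns ranks running from 1 to n, whereas move_rank, as far as the bottleneck documentation describes it, rescales the rank of the last element in each window to lie between -1 and 1. The sketch below assumes that rescaling is the linear map 2 * (rank - 1) / (n - 1) - 1. Both that formula and the name `normalized_rank` are assumptions made for illustration, not something stated in this thread.

```python
def normalized_rank(rank, n):
    """Map a 1-based rank within a window of n > 1 values onto [-1, 1].

    Assumed normalization, for illustration only.
    """
    return 2.0 * (rank - 1) / (n - 1) - 1.0

# In a window of 3 values with no ties, the plain ranks 1, 2, 3
# become -1, 0, 1 after rescaling.
print([normalized_rank(r, 3) for r in (1, 2, 3)])  # [-1.0, 0.0, 1.0]
```

Under this assumption, the same element would be reported as rank 3 by rankdata and as 1.0 by a moving rank, which is the source of the possible confusion.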

@shoyer
Member

shoyer commented Nov 21, 2017

> nanrankdata only support numeric types, how about wrapping rankdata with a keyword option ?

We already do dispatching to appropriate functions based on the dtype for aggregations:

def _create_nan_agg_method(name, numeric_only=False, np_compat=False,

(Yes, this is a bit of a mess)

Since NaN has a consistent sorting position in NumPy/bottleneck (it sorts to the end), I would suggest including a skipna keyword argument, like the one we use for aggregation functions. Alternatively, we could use na_option : {‘keep’, ‘top’, ‘bottom’} like pandas.
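
A rough sketch of what the three pandas-style na_option choices could mean for a 1-D sequence. The function name `rank_with_na_option` is hypothetical, and for brevity ties are broken by position (ordinal ranks) rather than averaged, so this is not a drop-in match for pandas or bottleneck behavior.

```python
import math

def rank_with_na_option(values, na_option="keep"):
    """Ordinal 1-based ranks with pandas-style NaN placement (illustrative)."""
    if na_option not in ("keep", "top", "bottom"):
        raise ValueError("na_option must be 'keep', 'top' or 'bottom'")
    # sorted() is stable, so equal values keep their original order.
    valid = sorted((i for i, v in enumerate(values) if not math.isnan(v)),
                   key=lambda i: values[i])
    missing = [i for i, v in enumerate(values) if math.isnan(v)]
    if na_option == "keep":
        order = valid              # NaNs are left unranked
    elif na_option == "top":
        order = missing + valid    # NaNs take the lowest ranks
    else:
        order = valid + missing    # NaNs take the highest ranks
    ranks = [math.nan] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = float(position)
    return ranks

data = [2.0, math.nan, 1.0]
print(rank_with_na_option(data, "keep"))    # [2.0, nan, 1.0]
print(rank_with_na_option(data, "top"))     # [3.0, 1.0, 2.0]
print(rank_with_na_option(data, "bottom"))  # [2.0, 3.0, 1.0]
```

A boolean skipna keyword would cover only two of these behaviors, which is one argument for the three-valued option.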

> There's also a push method similar to pandas ffill which would be nice.

@jhamman is already working on ffill/bfill in #1640
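
For context, a forward fill of the kind discussed here replaces each NaN with the most recent non-NaN value. Below is a minimal pure-Python sketch for a 1-D sequence. The `limit` parameter (maximum number of consecutive NaNs to fill) is modeled on pandas' ffill and is an assumption for illustration; it is not the signature of bottleneck.push or of the work in #1640.

```python
import math

def push(values, limit=None):
    """Forward-fill NaNs with the last valid value, at most `limit` in a row."""
    filled = []
    last = math.nan  # nothing to fill with until a valid value is seen
    gap = 0          # consecutive NaNs seen since the last valid value
    for v in values:
        if math.isnan(v):
            gap += 1
            if limit is None or gap <= limit:
                filled.append(last)
            else:
                filled.append(math.nan)
        else:
            last, gap = v, 0
            filled.append(v)
    return filled

print(push([math.nan, 1.0, math.nan, math.nan, 4.0], limit=1))
# [nan, 1.0, 1.0, nan, 4.0]
```

The leading NaN stays NaN because there is no earlier value to carry forward, and the second NaN in the run is left alone because the limit of one has been used up.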
