Skip to content

Slice with negative step inconsistent with python result #312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
minstrel271 opened this issue Apr 20, 2017 · 4 comments
Open

Slice with negative step inconsistent with python result #312

minstrel271 opened this issue Apr 20, 2017 · 4 comments

Comments

@minstrel271
Copy link

minstrel271 commented Apr 20, 2017

First of all, I understand that this crate is not meant to reproduce all feature in python, but since in the docs about Si python slice is used as comparison, so I believe this is worth to bring up.

Here is some test case with ndarray:

#[macro_use(s)]
extern crate ndarray;
use ndarray::prelude::*;


fn main() {
    let x: Array1<usize> = Array::zeros(16);

    println!("{}", x.slice(s![..4;1]).len());
    println!("{}", x.slice(s![4..;1]).len());
    println!("{}", x.slice(s![..-4;1]).len());
    println!("{}", x.slice(s![-4..;1]).len());
    println!("{}", x.slice(s![..4;-1]).len());
    println!("{}", x.slice(s![4..;-1]).len());
    println!("{}", x.slice(s![..-4;-1]).len());
    println!("{}", x.slice(s![-4..;-1]).len());
}

which return:

4
12
12
4
4
12
12
4

and in the corresponding python test case (in both python 2.7.6 and python 3.4.3):

def main():
    x = [0] * 16
    print(len(x[:4:1]))
    print(len(x[4::1]))
    print(len(x[:-4:1]))
    print(len(x[-4::1]))
    print(len(x[:4:-1]))
    print(len(x[4::-1]))
    print(len(x[:-4:-1]))
    print(len(x[-4::-1]))


if __name__ == '__main__':
    main()

which return:

4
12
12
4
11
5
3
13

It seems that, with negative step, the default begin is n - 1 and the default end is 0 for python. To be honest, I prefer the ndarray approach to kept the default consistance, but I believe this should either mention in the docs or changed.

@bluss
Copy link
Member

bluss commented Apr 20, 2017

Interesting. This was definitely a surprise to me:

In [25]: x = np.arange(16.)

In [26]: x[:4:-1]
Out[26]: array([ 15.,  14.,  13.,  12.,  11.,  10.,   9.,   8.,   7.,   6.,   5.])

@jturner314
Copy link
Member

jturner314 commented Nov 24, 2017

I just noticed this. In #377, I copied the existing behavior.

Other than the Python and ndarray 0.10 approaches, there's a third option I think is the best fit for ndarray. These are examples of the three options (indexing an axis of length 4 (indices 0, 1, 2, 3)):

  1. Python-style
    1. 2..0;-12, 1
    2. ..0;-23, 1
    3. 3..0;-23, 1
    4. 3..;-23, 1
    5. ..;-23, 1
  2. ndarray 0.10
    1. 1..3;-12, 1
    2. 1..;-23, 1
    3. 0..4;-23, 1
    4. ..4;-23, 1
    5. ..;-23, 1
  3. new approach
    1. 1..3;-12, 1
    2. 1..;-23, 1
    3. 0..4;-22, 0
    4. ..4;-22, 0
    5. ..;-22, 0

Option 1 makes sense to me since I've used Python for a while, but it doesn't match the documentation for std::ops::Range (which specifies that start..end contains values x with x >= start and x < end).

Option 2 matches the docs for std::ops::Range but behaves in surprising ways for step < -1. Examples i and ii make sense, but examples iii and iv are surprising because none of the indices are equal to start or end.

I think option 3 is the best fit for ndarray. It matches the docs for std::ops::Range, and the last index always equals start. With this approach, start..end;-step is the same as reverse(start..end;step).

What do you think? If you'd like to change the behavior, the 0.11 release would be a good time to do so. Either way, I can add more documentation to explain how negative steps are handled.

@bluss
Copy link
Member

bluss commented Nov 24, 2017

Great to see the details!

My default would be to not change the behaviour, so it would be best to have both the logical reason and a technical reason (like: fits this algorithm better or can be implemented with less conversion / checking loss etc).

As a rebuttal, the last element in 0..4 is a 3, and that's the reason that is the first element when the step is negative. It does seem to indicate a difference in perspective. Ndarray seems to think that we first slice, then reverse, then count steps.

Python just seems to do one thing differently -- it swaps the ends of i..j to be j..i when the step is negative, and it also swaps which of them is inclusive? I suppose swapping inclusiveness is not how it should be understood, but that's what it looks like compared to Rust ranges. (And in Rust we have a precedent for not doing that -- we use (i..j).rev() but not j..i where i <= j.)

@jturner314
Copy link
Member

I don't have a strong preference, although I think options 2 and 3 fit Rust conventions a little better than option 1.

Python's approach (option 1) to start:end:step is "the first index is start, then add step until reaching end (exclusive)".

ndarray 0.10's approach (option 2) can be expressed in a few different (equivalent) ways:

  • "If step is positive, then the first index is start, and then add step until reaching end (exclusive). If step is negative, then the first index is end - 1, and then add step until reaching start (inclusive)."
  • "First perform a slice with start..end. If step is negative, reverse the slice. Start at the front of the (possibly reversed) slice, and add step.abs() until reaching the back of the slice (inclusive)."
  • "First perform a slice with start..end. If step is positive, start with the front of the slice; if step is negative, start with the back of the slice. Then, add step until reaching the other end of the slice (inclusive)."

Option 3's approach is "If step is positive, then the first index is start, and then add step until reaching end (exclusive). If step is negative, then the indices are the same as if step were positive; just reverse the order."

Note that if we keep ndaray 0.10's behavior (option 2), it would be easy to add a s_py![a;b;c, d;e;f, ...] macro which interprets its arguments like Python and returns a SliceInfo instance. This might be useful for users who want to translate NumPy algorithms to ndarray. I don't think it's possible to do this if we change the default behavior to option 3, because the lengths of the axes would need to be known.

My default would be to not change the behaviour, so it would be best to have both the logical reason and a technical reason (like: fits this algorithm better or can be implemented with less conversion / checking loss etc).

Keeping the existing behavior fine with me. I don't see any technical reason to prefer one approach over another (other than the ability to add s_py! if we keep ndarray 0.10's behavior). It's also worth noting that if we change the behavior, it might be difficult for users to update their code if they're doing complex logic with slice indices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants