Skip to content

Add support for slicing with subviews #369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 30 commits into from

Conversation

jturner314
Copy link
Member

Please don't merge this PR immediately. It needs some refinement, including updated documentation and more tests.

This is an implementation of #215, adding support for taking subviews while slicing. A few of the changes are:

  • The s! macro can take individual indices (to indicate subviews) in addition to ranges.
  • The s! macro returns an owned SliceInfo instance instead of just a reference, so the caller can easily pass it around between methods.
  • The various *slice*() methods now take a SliceInfo instance (or reference).
  • There's now a slice_into() method. Note that ideally we'd come up with a better name than slice_into because there's already an into_slice() method.
  • ArrayBase now has three additional methods *slice_axis*() that slice along a single axis.

The primary disadvantage of this change is that it's not easy to create a SliceInfo instance without the s! macro. If that's an important feature, it would be possible to do any (or all) of the following:

  • Add impl<'a> From<&'a [SliceOrIndex]> for SliceInfo<Vec<SliceOrIndex>, IxDyn>.
  • Add a macro to convert [SliceOrIndex; n] into SliceInfo<[SliceOrIndex; n], D>, where D is determined at compile time.
  • Add a function to convert [SliceOrIndex; n] into SliceInfo<[SliceOrIndex; n], D>, where the caller would have to specify D, and D would be checked at runtime.

Note that this PR is a breaking change.

Will you please let me know what you think?

The same assertion occurs in `self.islice()`, so this one is
redundant.
@bluss
Copy link
Member

bluss commented Oct 19, 2017

I think it looks great. I don't always have time during the week to give so much feedback, not if I want it to be well thought out 😄 (Yes serialization for IxDyn was implemented in a half sedated state.)

I think I'll only have minor comments and curious questions through out

pub fn slice<I, T, Do>(&self, info: I) -> ArrayView<A, Do>
where
I: Borrow<SliceInfo<T, Do>>,
T: Borrow<D::SliceArg>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I would like to avoid the T parameter and simplify the types here, ideally using SliceInfo<D::SliceArg, ..., but I see that it's not exactly easy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Removing the T parameter is difficult because s! always returns a SliceInfo<T, D> where T is a fixed-length array. The constraint could be I: Borrow<SliceInfo<D::SliceArg, Do>> for Ix0 to Ix6, but this wouldn't work for IxDyn because IxDyn::SliceArg = [SliceOrIndex]. There are a few possible alternatives:

  • slice() could be implemented separately for each dimension D, so for Ix0 to Ix6, the T parameter wouldn't be necessary.
  • I think it would be possible to change the constraint to &'a I: Into<SliceInfo<&'b D::SliceArg, Do>>, but that adds lifetime parameters, which IMHO are worse than a type parameter.
  • If IxDyn::SliceArg = Vec<SliceOrIndex>, then I think it would be possible to change the constraint to I: Into<SliceInfo<D::SliceArg, Do>>, but that would mean that slice() would take ownership of the info parameter, which is not desirable.
  • If IxDyn::SliceArg = Vec<SliceOrIndex>, then I think it would be possible to change the constraint to &'a I: Into<SliceInfo<D::SliceArg, Do>>, but that would require the info argument to be a reference, which is not really desirable (see below).

It would be possible to remove the I parameter and instead make the signature pub fn slice<T, Do>(&self, info: &SliceInfo<T, Do>) -> ArrayView<A, Do>. However, that would mean that info has to be a reference, which would make usage with the s! macro less clean: arr.slice(&s![1..4, 3]). (It would be possible to make the s! macro to return &SliceInfo instead of SliceInfo, but as far as I know, then the &SliceInfo couldn't leave the scope where it was created, so it couldn't be returned from a function.)

In conclusion, in my limited Rust knowledge, the best options are requiring slice(&s![]) or requiring the extra I parameter. If you'd like me to remove I and require slice(&s![]), or if you have any other ideas, please let me know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will look at this in detail late. Looks possible to use &SliceInfo< [SliceorIndex]> for the ixdyn case, as long as the macro makes that but a fixed size array instead.

/// (**Panics** if `D` is `IxDyn` and `info` does not match the number of array axes.)
pub fn islice<I>(&mut self, info: I)
where
I: AsRef<[SliceOrIndex]>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method loses the static length check this way. It would be ideal to keep it, but it's a less used method so not that important.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I just pushed a fix to keep compile-type length checking.

///
/// **Panics** if an index is out of bounds or stride is zero.<br>
/// **Panics** if `axis` is out of bounds.
pub fn slice_axis(&self, axis: Axis, indices: &Si) -> ArrayView<A, D> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great little methods, they are overdue! -- Si is a simple copy type, so I'd pass it by reference or remodel the argument here (break into its parts, or?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean pass-by-value instead of pass-by-reference? If so, I just pushed a commit to change it to pass-by-value. Keeping the argument as Si instead of breaking it into parts is nice because there are convenient From<Range*> for Si implementations.

@jturner314
Copy link
Member Author

I don't always have time during the week to give so much feedback, not if I want it to be well thought out.

I understand, no worries. :) Thanks for looking at it today!

I think it looks great.

Since you like the approach, I'll work on updating the documentation and adding more tests.

@bluss
Copy link
Member

bluss commented Oct 20, 2017

Minor note, we don't have to worry too much about Rust < 1.21. We have promised in our docs to jump to newer rust versions to advance the library, and Rust 1.21 brings the needs_drop function which will be a great help to us.

@jturner314
Copy link
Member Author

jturner314 commented Oct 20, 2017

I successfully removed the T type parameter by using the approach you suggested (removing out_ndim and implementing impl<T: Borrow<[SliceOrIndex]>, D: Dimension> Borrow<SliceInfo<[SliceOrIndex], D>> for SliceInfo<T, D>).

Edit: I'm a little uncomfortable about the transmute because, according to the Rustonomicon, transmuting between non-repr(C) types is UB, although I think it should always work in practice.

Edit2: It turns out I can replace the transmute with a pointer cast and dereference, which seems safer. I just pushed a new commit.

@bluss
Copy link
Member

bluss commented Oct 24, 2017

Looks cool! The now we can make the final transition to &SliceInfo<D::SliceArg, DimOut>

@jturner314
Copy link
Member Author

jturner314 commented Oct 24, 2017

@bluss I'm working on cleaning up this PR in preparation for merging. I have a few questions:

  1. Should Si be renamed to something more descriptive, e.g. Slice?
  2. Is the name slice_into() okay? It's quite similar to the name of the existing into_slice() method.
  3. Are you interested in supporting usize ranges and indices? As a user, I think this would be very useful because most of my code works with usize values for lengths and indices, not isize. If so, how would you like to handle them? I see two options:
    • Cast usize to isize when creating the Si or SliceOrIndex object. The issue with this approach is that it doesn't work correctly if the value is > std::isize::MAX. (By the way, slicing in ndarray 0.10.12 is currently broken if the axis length is > std::isize::MAX, see below for an example.)
      let x = Array2::<u8>::zeros((std::isize::MAX as usize, 0));
      x.slice(s![0..1, ..]);
      println!("Succeeded!");
      x.slice(s![-1.., ..]);
      println!("Succeeded!");
      
      let x = Array2::<u8>::zeros((std::isize::MAX as usize + 1, 0));
      x.slice(s![0..1, ..]);
      println!("Succeeded!");
      x.slice(s![-1.., ..]);
      // thread 'main' panicked at 'attempt to add with overflow'
      
    • The option I prefer is to have separate Si and SliceOrIndex types/variants for usize and isize. This will allow correct operation even when the axis length is > std::isize::MAX.

@bluss
Copy link
Member

bluss commented Oct 24, 2017

  1. As first priority, try to incorporate Si into SliceOrIndex. Else it should have a longer name
  2. slice_into seems like a problematic name, unsure
  3. Yes it would be great to support usize as well. I'd be fine with having that in a separate PR. We don't have a range problem; axis length will never be greater than isize::MAX; Rust does not support objects larger than that IIUC.
  • What I've seen before is that it would need to implement all of isize, usize, and i32 -> SliceOrIndex conversions, if it should work with literals like 0 that are not typed (they fall back to i32).

@jturner314
Copy link
Member Author

  1. I incorporated Si into SliceOrIndex.

  2. One suggestion is to rename the existing into_slice method (as well as as_slice and as_slice_mut) to something more descriptive. Some possibilities are into_slice_standard, into_slice_std, or into_slice_std_order (to emphasize that the data must be in standard order); into_flat_slice (to emphasize that the data is flattened); or into_std_slice (to emphasize that the return value is the slice type from the standard library). These names would be sufficiently different from slice_into to avoid confusion.

  3. axis length will never be greater than isize::MAX; Rust does not support objects larger than that IIUC

    That's convenient! We can cast everything to isize, then. I can submit a separate PR once this one is done to add the conversions.

I think the current version of the implementation is pretty good, although I'm somewhat unsatisfied about the need to call .as_ref() (or some other method) when using .islice() with IxDyn arrays. What do you think about making only .islice() generic with a bound T: AsRef<D::SliceArg>? There's precedent for this in the standard library; see e.g. std::fs::File::open().

@bluss
Copy link
Member

bluss commented Oct 25, 2017

We can change islice like that later.

I think we've achieved a pretty good interface here. It may be hard for me to avoid tinkering with it after merging, I'd primarily see if there is a neater way to build the phantom dimension type.

For now, I'd put #[repr(C)] on the SliceInfo struct, should make the raw pointer cast we do even less problematic.

@bluss
Copy link
Member

bluss commented Oct 25, 2017

Naming of the slice methods. This may be too much of brainstorming, don't know if it works.

  • slice → slice_view
  • slice_mut → slice_view_mut
  • slice_into → slice
  • islice → slice_in_place

drawback: pretty breaking

@bluss
Copy link
Member

bluss commented Oct 25, 2017

We could also try slice_into → sliced, islice → slice_in_place and the other renames, leaving plain slice and slice_mut as deprecated?

@jturner314
Copy link
Member Author

Regarding naming, I think I prefer your first suggestion:

  • slice → slice_view
  • slice_mut → slice_view_mut
  • slice_into → slice
  • islice → slice_in_place

For islice, if you choose slice_in_place, I would also rename map_inplace and mapv_inplace to map_in_place and mapv_in_place. Otherwise, you could use slice_inplace.

Another option is islice → slice_mut, although the transition would be even more confusing.

Another option is

  • slice → slice or slice_view
  • slice_mut → slice_mut or slice_view_mut
  • slice_into → slice_move
  • islice → slice_inplace

which makes it easy for users to update their code. The _move suffix isn't used anywhere in the standard library AFAICT, but it is the recommended suffix for the owned variant according to RFC 199 if the default is an immutable borrow.

@jturner314
Copy link
Member Author

Yet another option is

  • .slice().view().slice()
  • .slice_mut().view_mut().slice()
  • .slice_into().slice()
  • .islice().slice_inplace()

I think my favorite option so far is:

  • .slice().slice()
  • .slice_mut().slice_mut()
  • .slice_into().slice_move()
  • .islice().slice_inplace()

@bluss
Copy link
Member

bluss commented Oct 26, 2017

Let's go with that. One concern is that slice_move is the generic move one, including for taking an ArrayView and slicing it in a lifetime-preserving way. .sliced() seems good, but it's not distinct enough to be well understood I guess.

@bluss
Copy link
Member

bluss commented Oct 26, 2017

Can we keep into_subview?

@jturner314
Copy link
Member Author

Can we keep into_subview?

Sure; I just assumed that you'd want it to name it like .slice_move(). I'll change it back to .into_subview().

@jturner314
Copy link
Member Author

I also renamed .isubview() to .subview_inplace(). Is that okay?

@bluss
Copy link
Member

bluss commented Oct 26, 2017

@jturner314 Yeah I think that's a good call, subview_inplace seems good.

@jturner314
Copy link
Member Author

Should the third member of SliceOrIndex::Slice be called the "step size" or the "stride" in the documentation?

@bluss
Copy link
Member

bluss commented Oct 26, 2017

step size sounds good, it doesn't mix the concepts. It's a multiplicative stride I guess? If we have a row stride of 12 and slice with step size 2, the new stride is 24.

@bluss
Copy link
Member

bluss commented Oct 28, 2017

Thanks a lot for working on this cool feature. Please update PR description and rebase & squash the commits so that we are ready to merge. I think it's ready, what do you think?

@bluss
Copy link
Member

bluss commented Oct 28, 2017

Oh that said, maybe I should look over if there are any non-breaking changes I can merge first.

@jturner314
Copy link
Member Author

I cleaned up the commit history and created a new ready-to-merge PR #377. I edited the history to try and make it as easy as possible to read (primarily by minimizing the size of the commit that adds SliceOrIndex, SliceInfo, etc.). I did make a few minor additional changes, the biggest one being moving do_slice and abs_index from dimension_trait.rs to dimension/mod.rs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants