Skip to content

return Option instead of assert!/fail! #15535

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lucidd opened this issue Jul 8, 2014 · 18 comments
Closed

return Option instead of assert!/fail! #15535

lucidd opened this issue Jul 8, 2014 · 18 comments

Comments

@lucidd
Copy link
Contributor

lucidd commented Jul 8, 2014

There are quite a few functions on traits for slices that fail when given out of bounds parameters rather than returning an Option.

  • slice
  • slice_to
  • slice_from
  • slice_chars
  • tail
  • tailn
  • subslice_offset
  • char_range_at
  • char_range_at_reverse
  • char_at
  • char_at_reverse

There are probably a lot more functions that could change to Option rather than fail but those are the ones i found so far.
If this is something we would want to change, i would offer myself to work on that.

@alexcrichton
Copy link
Member

cc @aturon

@liigo
Copy link
Contributor

liigo commented Jul 9, 2014

fail! is bad if we can avoid by returning Option.

2014-07-09 5:51 GMT+08:00 Alex Crichton [email protected]:

cc @aturon https://github.com/aturon


Reply to this email directly or view it on GitHub
#15535 (comment).

by Liigo, http://blog.csdn.net/liigo/
Google+ https://plus.google.com/105597640837742873343/

@pnkfelix
Copy link
Member

pnkfelix commented Jul 9, 2014

I see failing here as analogous to failing in the slice indexing operator, v[idx]: its part of what we accept as programmers, that we have to get our index values right or else we risk failure.

(I guess if we were to add syntax a la RFC PR 13 for a slice operator, v[a..b], then we could then change things so that the methods listed in the description above return Option<T> while the slice operator itself returns T and fails otherwise. But unless RFC PR 13 lands, I think the changes suggested here would be a net loss.)

@lucidd
Copy link
Contributor Author

lucidd commented Jul 9, 2014

@pnkfelix you are right its kind of analogous. The problem with fail is you can't really react to it (like you could in other languages with exceptions) you have to prevent it so you need to put checks in front of every call you do just to let the function itself check it again.
To do this correctly you basically need to know what exactly each function is checking and replicate it. This is not even that trivial for simple functions like StrSlice slice which checks its parameters with this function.

fn is_char_boundary(&self, index: uint) -> bool {
        if index == self.len() { return true; }
        if index > self.len() { return false; }
        let b = self.as_bytes()[index];
        return b < 128u8 || b >= 192u8;
}

I would not come up with that check and i certainly would not want to write this myself every time i call slice or any other method that calls slice internally (which are a lot). I don't think you should have to think
about that at all. The functions themselves know way better what to check than i do so why not let them handle the checking (like they already do) and give the programmer a consistent way of handling "failure" by returning Option or Result.

I guess what i'm asking is a consistent way to handle errors through Option/Result instead of the current mix of some functions that return Option/Result and some you have to check the parameters before you do the call.

For the indexing and the slicing operator failure might be ok because you would expect that coming from other languages and it would be tedious to writing a[0].unwrap() but maybe if accessing a
value that might not be there and therefore fail (like unwrap) hat some syntax associated with it it could be made easier to write. The easier option would be to let those operators fail and have some alternative functions like get and slice that do the same but return Options instead of failing so the programmer can choose.

@pnkfelix
Copy link
Member

pnkfelix commented Jul 9, 2014

@lucidd The title of your ticket implies a broad change that I assume would affect not only string slices, but also array/vector slices.

But it sounds like your issue is really with string slices alone, in terms of needing to maintain the utf8 property.

Is that a correct understanding? If so, is there a different proposal you can think of that would address your string handling issues while leaving array/vector slices alone?

@lucy
Copy link
Contributor

lucy commented Jul 9, 2014

Array/vector slices have the same issue, just to a smaller extent. Adding an additional set of methods that return Option or Result instead of failing (like Haskell's safe package) seems like it would address this.

@lucidd
Copy link
Contributor Author

lucidd commented Jul 9, 2014

I used StrSlice just as an example. Like @lucy said arrays/vectors have the similar problems and i don't think there should be a difference in error handling between them. It should not really matter what i want to slice/index or whatever the error handling should be the same.

My main point is reducing the overall use of fail! and replace it with Options/Result to make it easier to react to errors. Functions on arrays/vectors and strings seemed to be a good way to start since they are commonly used and profit the most from easy error handling.

@aturon
Copy link
Member

aturon commented Jul 9, 2014

I've been working on a set of guidelines for Rust, which includes a section on signaling errors. It's a quite tricky issue!

A couple of thoughts beyond the guidelines:

First, it is possible to recover from fail! by using tasks. For example, you can create a child task with a channel connecting it to its parent, and have the parent use recv_opt on the channel to detect successful completion or failure.

Second, cases like incorrect indexing are generally thought to be programming errors. In particular, you avoid them not by adding redundant checks before each operation, but instead by programming in a way that naturally gives you valid inputs. For example, a traditional for loop over loop indices naturally builds in the bounds checking; you don't have to check prior to each indexing operation. Similarly for string slices: if you're going to index into utf8-encoded data, your logic should already guarantee that you're indexing at valid character locations.

One way to look at this is: if you did get an Err result back, what would you do with it? Since a bad index nearly always represents a bug, you'd probably either unwrap (gaining nothing) or just propagate the error up until a natural "recovery" point -- but that's exactly what task failure/recovery is intended to do!

More generally, it's not a goal for all kinds of errors to be treated in the same way. A file not found error is quite a different beast from out of memory or index out of bounds. The guidelines linked above lay out some general classification of errors, and the appropriate way to signal them.

@liigo
Copy link
Contributor

liigo commented Jul 10, 2014

+1. the same i wanted to say.

2014-07-09 22:58 GMT+08:00 Kevin Walter [email protected]:

@pnkfelix https://github.com/pnkfelix you are right its kind of
analogous. The problem with fail is you can't really react to it (like you
could in other languages with exceptions) you have to prevent it so you
need to put checks in front of every call you do just to let the function
itself check it again.
To do this correctly you basically need to know what exactly each function
is checking and replicate it. This is not even that trivial for simple
functions like StrSlice slice which checks its parameters with this
function.

fn is_char_boundary(&self, index: uint) -> bool {
if index == self.len() { return true; }
if index > self.len() { return false; }
let b = self.as_bytes()[index];
return b < 128u8 || b >= 192u8;
}

I would not come up with that check and i certainly would not want to
write this myself every time i call slice or any other method that calls
slice internally (which are a lot). I don't think you should have to think
about that at all. The functions themselves know way better what to check
than i do so why not let them handle the checking (like they already do)
and give the programmer a consistent way of handling "failure" by returning
Option or Result.

I guess what i'm asking is a consistent way to handle errors through
Option/Result instead of the current mix of some functions that return
Option/Result and some you have to check the parameters before you do the
call.

For the indexing and the slicing operator failure might be ok because you
would expect that coming from other languages and it would be tedious to
writing a[0].unwrap() but maybe if accessing a
value that might not be there and therefore fail (like unwrap) hat some
syntax associated with it it could be made easier to write. The easier
option would be to let those operators fail and have some alternative
functions like get and slice that do the same but return Options instead
of failing so the programmer can choose.


Reply to this email directly or view it on GitHub
#15535 (comment).

by Liigo, http://blog.csdn.net/liigo/
Google+ https://plus.google.com/105597640837742873343/

@lucidd
Copy link
Contributor Author

lucidd commented Jul 10, 2014

@aturon Thanks for the link. I know that you can catch fail on a tasks but i did not consider using a task just to catch a possible failure on a simple function to be a valid solution. Fail is currently used in a lot of places where other languages would throw exceptions. But fail is by no means an exception replacement if the only way to "catch" it is to use a separate task.

Here are a few problems i see in the current description of when to use fail!.

  • input validity can be easily checked

This it true for a lot of functions that already return Option including Vec::pop. I can just use
is_empty to test if i can pop or not. Checking that is even easier than making 2 bound checks before using slice.
Calling pop on an empty vector is as much a logic error as trying to slice/index out of bounds.
The decision to make one fail and the other return Option is rather arbitrary.
ImmutableVector has a get method that returns an option so why is it ok here to have
an indexing operation guarded with option?

  • parts of the API naturally produce valid inputs for other parts

There are of cause ways to get only valid inputs and not have to introduce redundant checks but
that's not always the case. For example if i want to parse some binary data and know byte 0 to 20 represent the header i need to do those redundant bounds checks, where if i would get back an Option
i would profit from the checks the function already does for me and only react to the possibility that there
might be no slice from 0 to 20.

  • invalid inputs clearly represent a logic error

like i said before there are already functions that return None on a clearly logical error like
ImmutableVector::get or Vec::pop.

I feel like the whole fail! is acceptable does not go well with the safety goals of rust. Programmers
have to read the documentation to see what functions can fail and why instead of being explicitly warned
by the type system. Option and Result make it clear a function might not return a value or result in an
Error. If we would reduce the use of fail! to functions like unwrap where there is really no other way than to fail we would limit the code to audit for failure just like unsafe does for unsafe code.

@huonw
Copy link
Member

huonw commented Jul 10, 2014

Avoiding fail! would be nice, but the ergonomics of all of these basic level functions returning Option/Result would likely be... horrible.

(We have moved to using Option/Result as much as possible for everything else.)

I feel like the whole fail! is acceptable does not go well with the safety goals of rust

Note that "safety" has a fairly specific meaning in Rust: memory safety. fail! does interact somewhat with memory safety (having to be careful to only run destructors once on each value), but that's a bit of an edge case, and a program that just quits isn't regarded as unsafe.

(Unfortunately, Rust can't solve every problem.)

@liigo
Copy link
Contributor

liigo commented Jul 11, 2014

I disagree with you. Fail! is more horrible than returning Option, IMO.
2014年7月11日 上午5:47于 "Huon Wilson" [email protected]写道:

Avoiding fail! would be nice, but the ergonomics of all of these basic
level functions returning Option/Result would likely be... horrible.

(We have moved to using Option/Result as much as possible for everything
else.)

I feel like the whole fail! is acceptable does not go well with the safety
goals of rust

Note that "safety" has a fairly specific meaning in Rust: memory safety.
fail! does interact somewhat with memory safety (having to be careful to
only run destructors once on each value), but that's a bit of an edge case,
and a program that just quits isn't regarded as unsafe.

(Unfortunately, Rust can't solve every problem.)


Reply to this email directly or view it on GitHub
#15535 (comment).

@pnkfelix
Copy link
Member

@liigo let's not turn the issue tracker into a series of "you're wrong" "no you're wrong" comments

These issues are not resolved by community votes.

Instead, try to provide new technical insight or a fresh perspective; those are much better for core team members wrestling with these issues.

@huonw huonw added the A-libs label Jul 11, 2014
@liigo
Copy link
Contributor

liigo commented Jul 11, 2014

See what Kevin Walter said that fail! is bad than options.

@liigo
Copy link
Contributor

liigo commented Jul 11, 2014

@pnkfelix I didn't said "you are wrong", i just said "i disagree with you".
I did agree with Kevin Walter who said why "fail! is bad than returning option", but no one tell us why "return option is horrible than fail!"

@pnkfelix
Copy link
Member

@lucidd Regarding your point here:

For the indexing and the slicing operator failure might be ok because you would expect that coming from other languages and it would be tedious to writing a[0].unwrap() but maybe if accessing a
value that might not be there and therefore fail (like unwrap) hat some syntax associated with it it could be made easier to write. The easier option would be to let those operators fail and have some alternative functions like get and slice that do the same but return Options instead of failing so the programmer can choose.

Indeed, both the original ~[T] and the current &[T] (aka slice) offer a fn get method that returns Option<&'a T> precisely so that people who want to avoid failure can do so via that method. The fact that the current Vec fn get method can fail is just a temporary wart that will (or at least should) go away when things are working with the user-defined index operators so that we can start using vec[idx] again on Vec just like we used to with ~[T]. (Perhaps this is precisely the point you are trying to make; it is not clear to me from what you have written above.)


As a follow-up to @aturon 's comment above (that I hope is consistent with the philosophy he is espousing):

If there exist functions in the stdlib that can fail! on some inputs and it is not trivial to check the inputs for validity, then that is a real problem. For those cases, I think that we should change return type for such functions to return Option (or at least add a variant that does).

But I do not regard the examples in the description as instances of the above, since the is_char_boundary method is already provided on StrSlice, so you can do this check, you do not need to replicate the logic within its implementation yourself. (And plus, as @aturon noted, hopefully when you are manipulating a string's contents based on its byte indices, your program logic will make such checks unnecessary anyway.)

If there exist functions where the check is trivial but the methods for doing such checks are not documented, then we should fix the documentation to say how to do the checks. StrSlice may be an instance of this, since i do not see in its fn slice documentation any mention of the fn is_char_boundary method.


As I said above, if we do add a slice operator syntax, then I would be much more amenable to making the non-operator methods return Option<T>. Indeed, it seems like it could be somewhat reasonable to try to impose a design rule that non-operator methods should try to prefer returning Option<T> over failure. (Although one would need to either allow for exceptions like unwrap, or add an operator for unwrap, which many have clamored for but we have for the most part resisted adding.)

@pnkfelix
Copy link
Member

See also discussion on PR #15652 starting here; in particular, there are some opinions presented there that imply that we might remove get etc outright rather than immediately change their return types to Option<&'a T> in one step. So its not clear exactly when in the future Vec::get will return Option, if at all.

@thestinger
Copy link
Contributor

If a condition can be easily and efficiently checked before calling the function, it makes sense to consider it a logic error. It makes sense for indexing (or slicing) to fail for arrays, but it's much less sensible for hash tables and trees because the only way to find out if it's going to work is to try an expensive operation.

@lucidd lucidd closed this as completed Feb 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants