Skip to content

Add examples for type conversion #1123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 5, 2021
Merged

Add examples for type conversion #1123

merged 2 commits into from
Dec 5, 2021

Conversation

sgasse
Copy link
Contributor

@sgasse sgasse commented Nov 28, 2021

  • Add type_conversion.rs to illustrate some common conversions.
  • Update the documentation for numpy users.

];
unsafe {
let a_u8: Array<u8, _> = a_f32.mapv(|element| element.to_int_unchecked::<u8>());
assert_eq!(a_u8, array![1, 0, 254, 255, 254, 253, 0, 0]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written below, I don't think these results can be relied on outside of std.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I'll remove it.

@adamreichold
Copy link
Collaborator

I am not sure if this is too much for this MR, but similar to how we avoided the unchecked conversion, maybe we should include the infallible conversions based on From and the fallible conversions based TryFrom as well?

While as does not have undefined semantics in Rust any more, those two options seem preferable if applicable as they either will neither loose information guaranteed at compile time (From) or will make the program fail loudly at runtime if loss of information is detected (TryFrom and unwrap) which appears better suited than silently saturating for most scientific applications IMHO.

@sgasse
Copy link
Contributor Author

sgasse commented Dec 1, 2021

I am not sure if this is too much for this MR, but similar to how we avoided the unchecked conversion, maybe we should include the infallible conversions based on From and the fallible conversions based TryFrom as well?

While as does not have undefined semantics in Rust any more, those two options seem preferable if applicable as they either will neither loose information guaranteed at compile time (From) or will make the program fail loudly at runtime if loss of information is detected (TryFrom and unwrap) which appears better suited than silently saturating for most scientific applications IMHO.

Actually, I really like the idea, so thank you for pointing this out @adamreichold ! I completely refactored the examples so that From/TryFrom are the recommended ways. This is also updated in the 'for_numpy_users' documentation. Could you take another look?

Also, I was not able to figure out why the CI fails for my patch - how is the file that I added different from the other examples? If you can point me in the right direction, I'd be happy to fix it.

254.1, // rounded down to 254 by cutting the decimal part
-1.0, // saturated to 0 on the lower end
f32::INFINITY, // saturated to 255
f32::NAN, // converted to zero
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure @adamreichold - this is not UB, right? Otherwise, I'd at least add a comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is well-defined, c.f. https://doc.rust-lang.org/reference/expressions/operator-expr.html#type-cast-expressions (Generally speaking, Rust code that does not use unsafe will not have undefined behaviour. That casts of floating point numbers used to have was a bug which was resolved by fixing the saturating semantics.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A great, thank you for the link!

@adamreichold
Copy link
Collaborator

Also, I was not able to figure out why the CI fails for my patch - how is the file that I added different from the other examples? If you can point me in the right direction, I'd be happy to fix it.

Your example has a main function only if #[cfg(feature = "approx")], but the CI does not enable this feature. There are different ways to go about this but I think the easiest "fix" would be to add an empty main if the feature is missing, e.g.

#[cfg(not(feature = "approx"))]
fn main() {}

//
// Below, we illustrate three different approaches for the actual conversion
// in the closure.
// - `From` ensures perfect conversions known at compile time and is the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the official document refers to this as "lossless" instead of "perfect".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, now I understood the comment - thanks! Should be updated now.

// in the closure.
// - `From` ensures perfect conversions known at compile time and is the
// best default choice.
// - `TryFrom` either converts data perfectly or panics, ensuring that the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TryFrom performs a fallible conversion, whether the program panics or not is then up to the user of the trait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

assert_eq!(a_u8, array![120u8, 8u8, 0u8]);

// Simple upcasting with `as`
// Every `u8` fits perfectly into a `u32`, therefore we do not need to take
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In realistic programs, one often starts with this reasoning but especially with type interference the involved types can change and as becomes lossy even if it started out lossless. Hence, I think From should be recommended whenever possible. If the types changes and cannot be converted losslessly any more, compilation will fail and one can still decided to "downgrade" to a cast.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to have examples with as, so we should have some

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree that there should be examples, but I would recommend against choosing one where From could be used instead. The as example should rather do a potentially lossy conversion that cannot be done using From probably with a comment to use this if TryFrom is a performance issue or does not apply.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm the prime example for me would be f32 as i32 - but already have an example from floats to integers to show the saturating cast. With the updated comments/documentation, do you think it is fine or which example would you like to see in particular @adamreichold ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it is fine or which example would you like to see in particular @adamreichold ?

I think it is fine. The other one rather common example (for example when working with integer coordinates on a grid) is between signed and unsigned types, e.g. usize as isize (which can fail as usize::MAX > isize::MAX as usize).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thank you for the example, I included it in type_conversion.rs 🙂

//! NumPy | `ndarray` | Notes
//! ------|-----------|------
//! `a.astype(np.float32)` | `a.mapv(\|x\| f32::from(x))` | convert `u8` array infallibly safe to `f32` array with `std::convert::From`, generally recommended
//! `a.astype(np.uint8)` | `a.mapv(\|x\| u8::try_from(x))` | try to convert `i8` array to `u8` array, panic if any value cannot be converted perfectly at runtime (e.g. negative value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would result in an array of Results as the intended .unwrap() is missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes absolutely, thank you!

@bluss
Copy link
Member

bluss commented Dec 2, 2021

I don't want to pile on instructions here, but I think keep it simple. For my sake, we can write anything into the example file (but word more carefully in the documentation - carefully means with facts). Maybe this link is useful to link to (std's docs for the as keyword also link it): https://doc.rust-lang.org/nightly/reference/expressions/operator-expr.html#type-cast-expressions I don't understand all the weird caution around as personally, it's a well-defined and useful operation.

Comment on lines 537 to 538
//! - The `as` keyword performs a *saturating* cast since Rust 1.45 and should
//! only be used for safe, clear conversions such as upcasting.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The note about saturating casts is for floats only, not for all as conversions. We need a doc link anyway to explain the full rules, I think.

Suggested change
//! - The `as` keyword performs a *saturating* cast since Rust 1.45 and should
//! only be used for safe, clear conversions such as upcasting.
//! - The `as` keyword performs a cast which can convert between primitive types. [Other documentation links here]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a great suggestion, thank you!

@adamreichold
Copy link
Collaborator

I don't understand all the weird caution around as personally, it's a well-defined and useful operation.

I think that it wasn't until Rust 1.45 did not help its reputation. Leaving this aside, I see two kinds of problems when using as:

  • Together with unsafe code, it can lead to memory safety violations, for example by small negative integers becoming large positive integers, e.g. -1_isize as usize.
  • In numerical analyses, the fact that intermediate values saturated or wrapped might not be easily visible in the final results as they can manifest only by e.g. biasing a statistical measure.

@sgasse
Copy link
Contributor Author

sgasse commented Dec 2, 2021

Also, I was not able to figure out why the CI fails for my patch - how is the file that I added different from the other examples? If you can point me in the right direction, I'd be happy to fix it.

Your example has a main function only if #[cfg(feature = "approx")], but the CI does not enable this feature. There are different ways to go about this but I think the easiest "fix" would be to add an empty main if the feature is missing, e.g.

#[cfg(not(feature = "approx"))]
fn main() {}

Ah yes, that makes sense, thanks!

@sgasse
Copy link
Contributor Author

sgasse commented Dec 2, 2021

Thank you both @adamreichold and @bluss for the thorough review! I tried to strike a balance between warning about potential issues when using as while still providing some examples with it.
@bluss are the examples/docu updates in general too verbose for your taste or are they fine?

@bluss
Copy link
Member

bluss commented Dec 3, 2021

It's good. I could send in my own edits later if I have some opinions about wording.

@sgasse
Copy link
Contributor Author

sgasse commented Dec 3, 2021

It's good. I could send in my own edits later if I have some opinions about wording.

Alright, sounds good! 🙂 And I hope I finally got the feature guards right for CI ^^ Sorry for the mess-up there, should have run the all-tests.sh script earlier on my machine.

//
// Below, we illustrate three different approaches for the actual conversion
// in the closure.
// - `From` ensures perfect conversions known at compile time and is the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be missing?

@sgasse
Copy link
Contributor Author

sgasse commented Dec 4, 2021

Thank you for proof-reading @bluss ! I included all your comments - what exactly do you mean seems to be missing? An example with From can be found here. Or do I use From in the wrong way?
EDIT: Understood the comment now and updated.

Unfortunately, I cannot reproduce the error in the clippy run from the last workflow - but the error messages also seem unrelated to my patch.

- Add `type_conversion.rs` to illustrate some common conversions.
- Update the documentation for numpy users.
@bluss bluss added this to the 0.15.x milestone Dec 5, 2021
@bluss
Copy link
Member

bluss commented Dec 5, 2021

Thanks!

@bluss bluss merged commit 735afb0 into rust-ndarray:master Dec 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants