Commit 1cff753

Auto merge of #31500 - steveklabnik:fix_cow, r=alexcrichton
When I last did a pass through the string documentation, I focused on consistency across similar functions. Unfortunately, I missed some details. This example was _too_ consistent: it wasn't actually accurate! This commit fixes the docs to both be more accurate and to explain why the return type is a Cow<'a, str>. First reported here: https://www.reddit.com/r/rust/comments/44q9ms/stringfrom_utf8_lossy_doesnt_return_a_string/
2 parents 75271d8 + 5089b43 commit 1cff753

1 file changed, with 12 additions and 11 deletions

src/libcollections/string.rs

Lines changed: 12 additions & 11 deletions
@@ -479,16 +479,15 @@ impl String {
 }
 }

-/// Converts a slice of bytes to a `String`, including invalid characters.
+/// Converts a slice of bytes to a string, including invalid characters.
 ///
-/// A string slice ([`&str`]) is made of bytes ([`u8`]), and a slice of
-/// bytes ([`&[u8]`][byteslice]) is made of bytes, so this function converts between
-/// the two. Not all byte slices are valid string slices, however: [`&str`]
-/// requires that it is valid UTF-8. During this conversion,
+/// Strings are made of bytes ([`u8`]), and a slice of bytes
+/// ([`&[u8]`][byteslice]) is made of bytes, so this function converts
+/// between the two. Not all byte slices are valid strings, however: strings
+/// are required to be valid UTF-8. During this conversion,
 /// `from_utf8_lossy()` will replace any invalid UTF-8 sequences with
 /// `U+FFFD REPLACEMENT CHARACTER`, which looks like this: �
 ///
-/// [`&str`]: ../primitive.str.html
 /// [`u8`]: ../primitive.u8.html
 /// [byteslice]: ../primitive.slice.html
 ///
@@ -499,10 +498,13 @@ impl String {
 ///
 /// [`from_utf8_unchecked()`]: struct.String.html#method.from_utf8_unchecked
 ///
-/// If you need a [`&str`] instead of a `String`, consider
-/// [`str::from_utf8()`].
+/// This function returns a [`Cow<'a, str>`]. If our byte slice is invalid
+/// UTF-8, then we need to insert the replacement characters, which will
+/// change the size of the string, and hence, require a `String`. But if
+/// it's already valid UTF-8, we don't need a new allocation. This return
+/// type allows us to handle both cases.
 ///
-/// [`str::from_utf8()`]: ../str/fn.from_utf8.html
+/// [`Cow<'a, str>`]: ../borrow/enum.Cow.html
 ///
 /// # Examples
 ///
@@ -512,8 +514,7 @@ impl String {
 /// // some bytes, in a vector
 /// let sparkle_heart = vec![240, 159, 146, 150];
 ///
-/// // We know these bytes are valid, so we'll use `unwrap()`.
-/// let sparkle_heart = String::from_utf8(sparkle_heart).unwrap();
+/// let sparkle_heart = String::from_utf8_lossy(&sparkle_heart);
 ///
 /// assert_eq!("💖", sparkle_heart);
 /// ```
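
The two cases described in the new doc comment (valid input is borrowed, invalid input forces a new `String`) can be seen by matching on the returned `Cow`. The following is a minimal sketch and not part of this commit: the helper `lossy_allocated` is a hypothetical name introduced for illustration, and the invalid byte sequence is an arbitrary example.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not part of the commit): returns true when
/// `from_utf8_lossy` had to build a new `String` for the given bytes.
fn lossy_allocated(bytes: &[u8]) -> bool {
    match String::from_utf8_lossy(bytes) {
        // Input was already valid UTF-8: the result borrows the input.
        Cow::Borrowed(_) => false,
        // Input contained invalid sequences: an owned `String` was built.
        Cow::Owned(_) => true,
    }
}

fn main() {
    // Valid UTF-8 (the sparkle heart from the doc example): no allocation.
    let valid: Vec<u8> = vec![240, 159, 146, 150];
    let text = String::from_utf8_lossy(&valid);
    assert_eq!("💖", text);
    assert!(!lossy_allocated(&valid));

    // Invalid UTF-8 (illustrative input): a truncated multi-byte sequence
    // is replaced with U+FFFD, so the result must be owned.
    let invalid = b"Hello \xF0\x90\x80World";
    let text = String::from_utf8_lossy(invalid);
    assert_eq!("Hello \u{FFFD}World", text);
    assert!(lossy_allocated(invalid));

    // Callers that need a `String` in either case can call `into_owned()`.
    let owned: String = text.into_owned();
    assert_eq!(owned, "Hello \u{FFFD}World");
}
```

Callers that only read the text can use the `Cow<str>` directly, since it dereferences to `str`. `into_owned()` is only needed when an owned `String` is required.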
