String guide nits #17340
Comments
What's the difference between |
There are 3 basic levels of Unicode (and its encodings):
For UTF-8, like Rust's strings, the code units are bytes (
It is 45 bytes in UTF-8, 26 codepoints, but only 7 visible characters. The full breakdown is:
That is, each little diacritic/combining character is its own codepoint, but sequences of them are rendered together as a single visible unit. (NB. they aren't always separated, there are some precombined characters like |
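To make the bytes-versus-codepoints distinction concrete, here is a minimal sketch using only the standard library. The helper names `unit_counts` and `e_acute_forms` are hypothetical, and the character "é" is an illustrative choice, not the example from this thread. Grapheme (visible character) segmentation is not in today's standard library, so the sketch sticks to the first two levels.

```rust
/// Returns (UTF-8 code units, i.e. bytes; codepoints) for a string.
/// Hypothetical helper for illustration only.
fn unit_counts(s: &str) -> (usize, usize) {
    // `len` counts bytes; `chars` iterates over codepoints.
    (s.len(), s.chars().count())
}

/// The same visible character, "é", spelled two ways.
fn e_acute_forms() -> (&'static str, &'static str) {
    // Precomposed: a single codepoint, U+00E9.
    let precomposed = "\u{e9}";
    // Decomposed: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    let decomposed = "e\u{301}";
    (precomposed, decomposed)
}
```

Both forms render as one visible character, yet `unit_counts` gives 2 bytes and 1 codepoint for the precomposed form, and 3 bytes and 2 codepoints for the decomposed one.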
Ahhh gotcha. When would you prefer |
If possible, text should be treated as a black box chunk of memory that can only be read/written (not iterated over) and reencoded (almost always

Obviously it isn't always possible to avoid, e.g. an application may wish to convert textual source code into machine code (but who would do that in Rust anyway? ;P ), in which case it's whatever makes sense for what it's doing. That compiler example probably wants

But yes, it's mainly that you mentioned only 2 of 3.
The codepoint iterator `chars` is not mentioned in http://doc.rust-lang.org/master/guide-strings.html#indexing-strings

The use of `Str` in http://doc.rust-lang.org/master/guide-strings.html#generic-functions is not particularly idiomatic; since `T` is at the 'top level' taking a `&str` directly is fine; being generic over `Str` is better when it may be expensive for the user to get to a `&str`, e.g. if the string data is inside something

If the user has `&[String]` and wants to call `foo`, they're forced to allocate storage to store the `.as_slice`'s of each element, which can be arbitrarily expensive (they may have a lot of `String`s); `bar` gets around this since it can be called with `&[String]` or `&[&str]`.

(FWIW, this is actually still true for functions taking iterators, even though one can sometimes `.map(|s| s.as_slice())` to get a `Iterator<&str>`. If you have a `Iterator<String>`, there's no way to convert that into a `Iterator<&str>` without collecting into a temporary data structure.)