Skip to content

cstr16: add method to get the underlying bytes #788

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 5, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@
- `From<&CStr16>` for `String`
- `From<&CString16>` for `String`
- Added `RuntimeServices::get_variable_boxed` (requires the `alloc` feature).
- Added `CStr16::as_bytes`
- Added `AsRef<[u8]>` and `Borrow<[u8]>` for `Cstr8` and `CStr16`.

### Changed

Expand All @@ -44,6 +46,8 @@
- The `MEMORY_DESCRIPTOR_VERSION` constant has been moved to
`MemoryDescriptor::VERSION`.
- The `Revision` struct's one field is now public.
- Renamed `CStr8::to_bytes` to `CStr8::as_bytes` and changed the semantics:
The trailing null character is now always included in the returned slice.

## uefi-macros - [Unreleased]

Expand Down
2 changes: 1 addition & 1 deletion uefi-macros/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -262,7 +262,7 @@ pub fn entry(args: TokenStream, input: TokenStream) -> TokenStream {
/// assert_eq!(cstr8!().to_u16_slice_with_nul(), [0]);
/// assert_eq!(cstr8!("").to_u16_slice_with_nul(), [0]);
/// // Non-empty string
/// assert_eq!(cstr8!("test").to_bytes_with_nul(), [116, 101, 115, 116, 0]);
/// assert_eq!(cstr8!("test").as_bytes(), [116, 101, 115, 116, 0]);
/// ```
#[proc_macro]
pub fn cstr8(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
Expand Down
60 changes: 51 additions & 9 deletions uefi/src/data_types/strs.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
use super::chars::{Char16, Char8, NUL_16, NUL_8};
use super::UnalignedSlice;
use crate::polyfill::maybe_uninit_slice_assume_init_ref;
use core::borrow::Borrow;
use core::ffi::CStr;
use core::iter::Iterator;
use core::mem::MaybeUninit;
Expand Down Expand Up @@ -118,16 +119,10 @@ impl CStr8 {
self.0.as_ptr()
}

/// Converts this CStr8 to a slice of bytes without the terminating null byte.
/// Returns the underlying bytes as slice including the terminating null
/// character.
#[must_use]
pub fn to_bytes(&self) -> &[u8] {
let chars = self.to_bytes_with_nul();
&chars[..chars.len() - 1]
}

/// Converts this CStr8 to a slice of bytes containing the trailing null byte.
#[must_use]
pub const fn to_bytes_with_nul(&self) -> &[u8] {
pub const fn as_bytes(&self) -> &[u8] {
unsafe { &*(&self.0 as *const [Char8] as *const [u8]) }
}
}
Expand All @@ -147,6 +142,18 @@ impl fmt::Display for CStr8 {
}
}

impl AsRef<[u8]> for CStr8 {
fn as_ref(&self) -> &[u8] {
self.as_bytes()
}
}

impl Borrow<[u8]> for CStr8 {
fn borrow(&self) -> &[u8] {
self.as_bytes()
}
}

impl<StrType: AsRef<str> + ?Sized> EqStrUntilNul<StrType> for CStr8 {
fn eq_str_until_nul(&self, other: &StrType) -> bool {
let other = other.as_ref();
Expand Down Expand Up @@ -389,6 +396,25 @@ impl CStr16 {
}
Ok(())
}

/// Returns the underlying bytes as slice including the terminating null
/// character.
#[must_use]
pub fn as_bytes(&self) -> &[u8] {
unsafe { slice::from_raw_parts(self.0.as_ptr().cast(), self.num_bytes()) }
}
}

impl AsRef<[u8]> for CStr16 {
fn as_ref(&self) -> &[u8] {
self.as_bytes()
}
}

impl Borrow<[u8]> for CStr16 {
fn borrow(&self) -> &[u8] {
self.as_bytes()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to leave out the AsRef/Borrow impls, as it would not be as clear when reading the code as an explicit call to to_bytes_with_nul.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd much rather drop self.to_bytes() and rename self.to_bytes_with_nul() to self.to_bytes(). Then we can keep AsRef and Borrow. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that sounds reasonable. Also, I didn't think of this before, but it should probably be as_bytes instead of to_bytes (per the conversion naming guidelines).

For Borrow and AsRef, do you have an example of how they'd get used in practice? Like <CStr16 as AsRef<[u8]>>::as_ref(string) is pretty verbose so I assume a user would usually prefer to write string.as_bytes(). But perhaps there's some generic code where the former would be more usable? An example would help me clear up my doubts since I always get a bit confused with those two very-similar-seeming traits.

Copy link
Member Author

@phip1611 phip1611 May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. For example FileSystem::write() consumes content: impl AsRef<[u8]> as second parameter. This is equivalent to the standard lib's implementation.

Hence, you can just write a CStr16 to a file.

I only implemented both, AsRef and Borrow, as the standard library also does this for a lot of types (for example for std::path::Path). In my past years I've come accross a lot of code that used AsRef-types as parameters but Borrow only once or twice - Probably good to have both, tho.

Copy link
Member Author

@phip1611 phip1611 May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update Hm, now I'm confused.. https://doc.rust-lang.org/core/ffi/struct.CStr.html provides to_bytes() and to_bytes_with_nul() methods, but those method implementations are cheap (other than the naming convention implies).

But I do not see a strong reason why we should be consistent with https://doc.rust-lang.org/core/ffi/struct.CStr.html .. but..on the other hand.. we can also stay consistent with std::ffi::CStr and provide to_bytes(), to_bytes_until_nul(), and drop AsRef and Borrow, as it might be confusing then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I would guess the reason is this:

Note: This method is currently implemented as a constant-time cast, but it is planned to alter its definition in the future to perform the length calculation whenever this method is called.

Several CStr methods make note that the implementation of the type may change in the future to make it more FFI friendly. Not sure how likely that is to ever actually happen.

I'm realizing I don't have a very strong opinion here :)

How about: just provide the as_bytes method (so no separate with/without null methods), and also include the AsRef and Borrow impls?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

#[cfg(feature = "alloc")]
Expand Down Expand Up @@ -522,6 +548,14 @@ mod tests {
assert!(msg.eq_str_until_nul(cstr8));
}

#[test]
fn test_cstr8_as_bytes() {
let string: &CStr8 = cstr8!("a");
assert_eq!(string.as_bytes(), &[b'a', 0]);
assert_eq!(<CStr8 as AsRef<[u8]>>::as_ref(string), &[b'a', 0]);
assert_eq!(<CStr8 as Borrow<[u8]>>::borrow(string), &[b'a', 0]);
}

#[test]
fn test_cstr16_num_bytes() {
let s = CStr16::from_u16_with_nul(&[65, 66, 67, 0]).unwrap();
Expand Down Expand Up @@ -617,6 +651,14 @@ mod tests {
);
}

#[test]
fn test_cstr16_as_bytes() {
let string: &CStr16 = cstr16!("a");
assert_eq!(string.as_bytes(), &[b'a', 0, 0, 0]);
assert_eq!(<CStr16 as AsRef<[u8]>>::as_ref(string), &[b'a', 0, 0, 0]);
assert_eq!(<CStr16 as Borrow<[u8]>>::borrow(string), &[b'a', 0, 0, 0]);
}

// Code generation helper for the compare tests of our CStrX types against "str" and "String"
// from the standard library.
#[allow(non_snake_case)]
Expand Down
4 changes: 2 additions & 2 deletions uefi/src/proto/string/unicode_collation.rs
Original file line number Diff line number Diff line change
Expand Up @@ -103,12 +103,12 @@ impl UnicodeCollation {
fat: &CStr8,
buf: &'a mut [u16],
) -> Result<&'a CStr16, StrConversionError> {
if buf.len() < fat.to_bytes_with_nul().len() {
if buf.len() < fat.as_bytes().len() {
return Err(StrConversionError::BufferTooSmall);
}
(self.fat_to_str)(
self,
fat.to_bytes_with_nul().len(),
fat.as_bytes().len(),
fat.as_ptr(),
buf.as_ptr() as *mut _,
);
Expand Down