
UBSan: Incorrect generated __ubsan_handle_type_mismatch_v1? #136592


Closed
zanmato1984 opened this issue Apr 21, 2025 · 4 comments
Labels
compiler-rt:ubsan Undefined behavior sanitizer question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!

Comments

@zanmato1984

In apache/arrow we encountered a puzzling UBSan error (apache/arrow#46124 (comment)) that seems to be a false alarm. A reduced case can be found here.

To summarize, for function:

#include <cstddef>
#include <cstdint>
#include <cstring>

uint64_t read64(const uint64_t* src, size_t n) {
    uint64_t result = 0;
    std::memcpy(&result, src, n);
    return result;
}

A misaligned src shouldn't be considered UB here, because it is merely passed into std::memcpy as void *, which has no alignment requirement. However, the generated code jumps to __ubsan_handle_type_mismatch_v1 whenever src is misaligned, regardless of how it is used afterwards.

As a comparison, changing the pointer type from uint64_t * to uint8_t * produces the expected codegen, with no alignment check:

uint64_t read8(const uint8_t* src, size_t n) {
    uint64_t result = 0;
    std::memcpy(&result, src, n);
    return result;
}

The same behavior occurs on x86 as well, and it appears as early as 18.1.0 (17.0.1 is fine) on both Arm and x86.
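For illustration, here is a minimal sketch of exercising the uint8_t * variant on a deliberately misaligned address. The helper read8_matches_source and its local buffer are hypothetical and not part of the reduced case; the check compares raw bytes so that it does not depend on endianness.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Same shape as read8 above; n must not exceed sizeof(uint64_t).
uint64_t read8(const uint8_t* src, size_t n) {
    uint64_t result = 0;
    std::memcpy(&result, src, n);
    return result;
}

// Hypothetical helper: reads 8 bytes starting at buf + offset and checks
// that the bytes copied into the result match the source bytes.
bool read8_matches_source(size_t offset) {
    // alignas(8) guarantees that buf + 1 is not 8-byte aligned.
    alignas(8) uint8_t buf[16];
    for (size_t i = 0; i < sizeof(buf); ++i) {
        buf[i] = static_cast<uint8_t>(i + 1);
    }
    // Refuse offsets that would read past the end of the buffer.
    if (offset + sizeof(uint64_t) > sizeof(buf)) {
        return false;
    }
    const uint8_t* src = buf + offset;
    uint64_t value = read8(src, sizeof(uint64_t));
    return std::memcmp(&value, src, sizeof(uint64_t)) == 0;
}
```

Calling read8_matches_source(1) reads from an address that is not 8-byte aligned. Since the parameter type is uint8_t *, whose alignment requirement is 1, this is the case reported above as generating no alignment check.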

@EugeneZelenko EugeneZelenko added compiler-rt:ubsan Undefined behavior sanitizer and removed new issue labels Apr 21, 2025
@efriedma-quic
Collaborator

This is not a false alarm: clang code generation actually will assume the input pointer is aligned in this case.

The most recent discussion of alignment and ubsan is https://discourse.llvm.org/t/rfc-enforcing-pointer-type-alignment-in-clang-and-ubsan/83922 .

@zanmato1984
Author

zanmato1984 commented Apr 21, 2025

Thank you @efriedma-quic for the pointer. That makes sense to me. However, I have a further question regarding misaligned pointers.

> This is not a false alarm: clang code generation actually will assume the input pointer is aligned in this case.

Does this mean that clang assumes any pointer is properly aligned regardless of how it is used, i.e., whether it is dereferenced (which would be UB by definition) or cast to a type with a different alignment (in my case, to void * and passed into std::memcpy)?

@efriedma-quic
Collaborator

Specifically, pointers that are arguments to load/store/memcpy/memset/etc. are assumed to be aligned. The implicit cast to void* doesn't count as a separate operation. See #67766.
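One way to act on this, sketched under assumptions rather than taken from the thread: keep the parameter untyped (const void *) so that no typed pointer with an 8-byte alignment requirement ever reaches memcpy, and test alignment explicitly when it matters. The names is_aligned_for and read64_unaligned are hypothetical, and the clamp on n is an addition for safety that the original read64 does not have. The integer conversion used for the alignment test is implementation-defined, though it behaves as expected on common flat-address platforms.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p satisfies the alignment requirement of T.
// Relies on the implementation-defined pointer-to-integer mapping.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical variant of read64 that takes untyped memory, so the
// signature itself makes no promise about the alignment of src.
uint64_t read64_unaligned(const void* src, size_t n) {
    uint64_t result = 0;
    if (n > sizeof(result)) {
        n = sizeof(result);  // clamp so the copy cannot overflow result
    }
    std::memcpy(&result, src, n);
    return result;
}
```

A caller holding a byte buffer can pass buf + 1 to read64_unaligned directly, or use is_aligned_for<uint64_t>(p) to decide whether forming a const uint64_t * from p is safe in the first place.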

@zanmato1984
Author

> See #67766.

This is even more helpful. Thanks!

I'm closing this issue.

@EugeneZelenko EugeneZelenko added the question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead! label Apr 21, 2025