Clang adds "noundef" annotation to char arguments #56551

I think the way clang translates the following C code to LLVM IR is incorrect:
```C
char id(char c) {
    return c;
}

void my_memcpy(char *src, char *dst, int n) {
    for (int i = 0; i < n; i++) {
        dst[i] = id(src[i]);
    }
}
```
The resulting IR defines `id` as `@id(i8 noundef signext %0)`. The `noundef` is what concerns me. With this translation, calling `my_memcpy` as follows is UB, since some of the bytes being copied *are* undef or poison (namely, the padding bytes):
```C
#include <stdint.h>

struct S {
    uint8_t f1;
    uint16_t f2;
};

void testcase() {
    struct S s, s_copy;
    s.f1 = 0;
    s.f2 = 0;
    my_memcpy((char*)&s, (char*)&s_copy, sizeof(struct S));
}
```
If I understand the C standard correctly, this program (running `testcase`) is entirely well-defined. In particular, C17 6.2.6.1 §5 says (emphasis mine):
> Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression **that does not have character type**, the behavior is undefined.

But we are using a character type here. The standard also explicitly says in 6.2.6.2 that `char` types have no padding bits. So I don't think there is any room here for UB to arise when copying arbitrary data (including uninitialized memory) at `char` type. Therefore clang should not add `noundef` to character type variables.
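To make the padding claim concrete, here is a minimal sketch. It assumes a typical ABI where `uint16_t` is 2-byte aligned, so that `struct S` contains one padding byte. The helpers `padding_bytes`, `copy_bytes`, and `members_survive_copy` are hypothetical names introduced only for illustration. The copy loop has the same shape as `my_memcpy` but does not pass each byte through a function call, so no `noundef` parameter is involved.

```c
#include <stddef.h>
#include <stdint.h>

struct S {
    uint8_t f1;
    uint16_t f2;
};

/* Number of bytes in struct S that belong to no member, i.e. padding.
   Assumption: 1 on common ABIs; the standard does not fix this value. */
static size_t padding_bytes(void) {
    return sizeof(struct S) - sizeof(uint8_t) - sizeof(uint16_t);
}

/* Byte-wise copy at character type, including any padding bytes. */
static void copy_bytes(const unsigned char *src, unsigned char *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}

/* Returns 1 if the named members compare equal after a byte-wise copy.
   The padding bytes of `s` are never initialized before being copied. */
static int members_survive_copy(void) {
    struct S s, s_copy;
    s.f1 = 7;
    s.f2 = 1234;
    copy_bytes((const unsigned char *)&s, (unsigned char *)&s_copy, sizeof s);
    return s_copy.f1 == 7 && s_copy.f2 == 1234;
}
```

Under the reading of the standard given above, `members_survive_copy` is well-defined even though `copy_bytes` reads the uninitialized padding of `s`.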

Furthermore, I am not entirely sure what the status of [this proposal](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm) is, but if it has been accepted, then I am not sure that adding `noundef` to any other integer type is correct, either. That proposal states explicitly
> None of the integral types have extraordinary values.

And at least for C++, https://eel.is/c++draft/basic.fundamental#4 has a note on padding in integer types stating
> Padding bits have unspecified value, but cannot cause traps.

So, at least for C++, I cannot see a justification for why clang adds `noundef` to all integer types. For non-character integer types in C, the standard is not clear enough for me to be sure either way.

Cc @aqjune @nunoplopes 
