Description
This bug was first reported in #78034 (comment).
I am forking it here to make clear that it is not related to the C23 standard.
The problem occurs when {} is used to initialize a static variable of union type, as the Linux kernel does (https://github.com/torvalds/linux/blob/master/net/xfrm/xfrm_state.c#L1142):
typedef union {
    __be32 a4;
    __be32 a6[4];
    struct in6_addr in6;
} xfrm_address_t;

struct xfrm_state* xfrm_state_find() {
    static xfrm_address_t saddr_wildcard = {};
}
Then the code uses all bytes of saddr_wildcard to generate a hash value.
But in the LLVM IR, only the first field of saddr_wildcard is zero-initialized; the remaining bytes are marked as undef.
@saddr_wildcard = internal global { i32, [12 x i8] } { i32 0, [12 x i8] undef }, align 4, !dbg !9
With some optimization settings (for example -O2 combined with the always_inline attribute), clang propagates the undef part of saddr_wildcard as undef or poison values (GlobalOptPass), removes the instructions that use those values (InstCombinePass), and ends up producing a wrong hash result.
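To illustrate why the trailing bytes matter, here is a minimal sketch of a hash that reads every word of such a union. It is not the kernel's code: addr_sketch_t, in6_addr_sketch, addr_hash_sketch and hash_of_zeroed_addr are hypothetical names, uint32_t stands in for __be32, and the XOR hash is only an assumption about the general shape of the real hash. The sketch zeroes the union with memset so that its result does not depend on how the compiler treats {}.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct in6_addr (16 bytes). */
struct in6_addr_sketch {
    uint8_t s6_addr[16];
};

/* Same shape as xfrm_address_t, with uint32_t standing in for __be32. */
typedef union {
    uint32_t a4;
    uint32_t a6[4];
    struct in6_addr_sketch in6;
} addr_sketch_t;

/* Hypothetical hash: reads all four 32-bit words, not only the first
 * member a4, so bytes 4..15 influence the result. */
static uint32_t addr_hash_sketch(const addr_sketch_t *addr)
{
    uint32_t words[4];

    memcpy(words, addr, sizeof(words));
    return words[0] ^ words[1] ^ words[2] ^ words[3];
}

/* Hash of an address whose 16 bytes are all explicitly zeroed. */
static uint32_t hash_of_zeroed_addr(void)
{
    addr_sketch_t addr;

    memset(&addr, 0, sizeof(addr));
    return addr_hash_sketch(&addr);
}
```

If only a4 were guaranteed to be zero, words[1] through words[3] would have no defined value, and an optimizer would be free to fold the whole XOR into anything.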
I also reported this problem to the Linux kernel. Here is what Linus Torvalds replied in https://www.spinics.net/lists/netdev/msg1007244.html:
In the kernel, we do expect initializers that always initialize the
whole variable fully.
This is literally about "the linux kernel expects initializers to
FULLY initialize variables". Padding, other union members, you name
it.
If clang doesn't do that, then clang is buggy as far as the kernel is
concerned, and no amount of standards reading is relevant.
And in particular, no amount of "but empty initializer" is relevant.
In my understanding, the Linux kernel expects:
- When an initializer is used for a variable, all unspecified bytes (including padding) are initialized to zero.
- This is not limited to unions; it also applies to other aggregate types: structs and arrays.
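The expectation can be stated as a byte-level check. The following is a sketch only: padded_sketch, all_bytes_zero, zeroed_struct_is_all_zero and detects_nonzero_tail are hypothetical names, and the struct layout (padding after tag) is an assumption about typical ABIs. It uses memset rather than an initializer, because memset is the one form that is certain to write every byte, padding included.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct; on typical ABIs there is padding after 'tag'. */
struct padded_sketch {
    char tag;
    int value;
};

/* Returns 1 if every byte of the object is zero, including padding. */
static int all_bytes_zero(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;

    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}

/* memset writes every byte of the object, so the check must pass. */
static int zeroed_struct_is_all_zero(void)
{
    struct padded_sketch s;

    memset(&s, 0xAA, sizeof(s)); /* simulate stale stack contents */
    memset(&s, 0, sizeof(s));
    return all_bytes_zero(&s, sizeof(s));
}

/* A single non-zero byte at the end must be detected. */
static int detects_nonzero_tail(void)
{
    unsigned char buf[16] = {0};

    buf[15] = 1;
    return all_bytes_zero(buf, sizeof(buf)) == 0;
}
```

What the kernel expects, as I understand it, is that the same check would also pass for a variable initialized with {} or with a partial initializer.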