Clang's Optimization Introduces Unexpected Sign Extension in RISC-V Bit-Field Operations #68855

@gyuminb

Description

Environment:

  • Compiler: Clang-18
  • Target Architecture: RISC-V
  • Optimization Level: -O1, -O2, -O3
  • OS: Ubuntu 22.04.2

Summary:

When compiling code that combines bit-field operations with type casts, Clang produces an unexpected result for the RISC-V architecture at optimization levels -O1, -O2, and -O3. The result deviates from what the C language standard requires and does not occur at -O0.

Steps to Reproduce:

  1. Compile the provided source code with Clang, targeting the RISC-V architecture.
  2. Use optimization levels -O1, -O2, or -O3.
  3. Execute the compiled binary.

Expected Result:

resultValue1: ffff
resultValue2: 0

Actual Result:

resultValue1: ffffffff
resultValue2: 0

Source Code to Reproduce:

#include <stdio.h>

typedef struct {
    unsigned int bitField : 13;
} CustomStruct;

unsigned int resultValue1 = 0;
short resultValue2 = 0;
CustomStruct customArray[2] = {{0U} , {0U}};

int main()
{
    resultValue1 = (unsigned int) ((unsigned short) (~(customArray[0].bitField)));
    printf("resultValue1: %x\n", resultValue1);

    resultValue2 = (short) (customArray[1].bitField);   
    printf("resultValue2: %x\n", resultValue2);
    return 0;
}

Observation:

customArray[0].bitField is a 13-bit unsigned bit-field holding 0. Before the ~ operator is applied, the bit-field undergoes integer promotion to int, because int can represent every value of a 13-bit unsigned field. Inverting the promoted value 0 therefore sets all bits of the int, giving -1 (0xFFFFFFFF with a 32-bit int), not 0x1FFF.

Casting this value to (unsigned short) reduces it modulo 2^16 to a 16-bit (2-byte) value, which should then be 0xFFFF.

Further casting this value to (unsigned int) should maintain the value at 0xFFFF. This is the expected behavior as per the C language standard for type casting.
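
The conversion chain can be spelled out one step at a time. The following is a minimal sketch, assuming a 32-bit int and a 16-bit unsigned short (as on the usual RISC-V ABIs); the helper name invert_and_narrow is hypothetical and does not appear in the reproducer.

```c
typedef struct {
    unsigned int bitField : 13;
} CustomStruct;

/* Hypothetical helper: evaluates
 * (unsigned int)((unsigned short)(~s->bitField))
 * with every intermediate conversion made explicit. */
unsigned int invert_and_narrow(const CustomStruct *s)
{
    /* Integer promotion: a 13-bit unsigned bit-field fits in int,
     * so the operand of ~ has type int, not unsigned int. */
    int promoted = s->bitField;

    /* Bitwise complement of the promoted value; for 0 this is -1. */
    int inverted = ~promoted;

    /* Conversion to unsigned short reduces the value modulo 2^16
     * (assumes a 16-bit unsigned short). */
    unsigned short narrowed = (unsigned short)inverted;

    /* Conversion to unsigned int preserves the value, which means
     * the upper bits must be zero (zero extension). */
    unsigned int widened = narrowed;

    return widened;
}
```

With bitField equal to 0, as in the reproducer, this helper should return 0xFFFF at every optimization level on a conforming compiler.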

However, this holds only without optimization (-O0). With optimization (-O1, -O2, or -O3), resultValue1 unexpectedly becomes 0xFFFFFFFF. It seems that after the cast to unsigned short, the widening to unsigned int is not carried out correctly, possibly sign-extending rather than zero-extending the value.

This suggests a potential issue with either a specific implementation of the RISC-V architecture or this version of the Clang compiler. The result deviates from the behavior required by standard C, which points to a probable compiler bug.

Additional Information:

  • https://godbolt.org/z/Pv3Gaacv9
  • The issue seems to stem from the slli and srli instructions used in succession in the optimized versions, resulting in sign-extension.
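
For reference, the difference between the two kinds of 16-to-32-bit extension can be modeled in C. This is only a sketch on 32-bit values and not a transcription of the generated assembly; the function names zero_extend16 and sign_extend16 are hypothetical. A left shift followed by a logical right shift zero-extends the low 16 bits, whereas sign extension copies bit 15 into the upper bits.

```c
#include <stdint.h>

/* Zero-extend the low 16 bits: shift them to the top of the word,
 * then shift back with a logical (unsigned) right shift. */
uint32_t zero_extend16(uint32_t x)
{
    return (x << 16) >> 16;
}

/* Sign-extend the low 16 bits using portable unsigned arithmetic:
 * flipping bit 15 and subtracting 0x8000 replicates bit 15 into
 * the upper 16 bits (arithmetic wraps modulo 2^32). */
uint32_t sign_extend16(uint32_t x)
{
    uint32_t low = x & 0xFFFFu;
    return (low ^ 0x8000u) - 0x8000u;
}
```

Applied to the all-ones value produced by the complement, zero_extend16 yields 0xFFFF (the expected resultValue1), while sign_extend16 yields 0xFFFFFFFF (the value actually observed with optimization).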

Recommendation:

Please verify the behavior observed using the provided Godbolt link and investigate the underlying cause in the Clang compiler for RISC-V. It's essential to ensure consistent behavior across optimization levels and adherence to the C language standard.
