Skip to content

Accessing qualified member of a structure or union rvalue allows creating qualified rvalues #96713

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Halalaluyafail3 opened this issue Jun 25, 2024 · 12 comments · May be fixed by #96913
Open
Labels
c23 clang:frontend Language frontend issues, e.g. anything involving "Sema" confirmed Verified by a second party

Comments

@Halalaluyafail3
Copy link

Halalaluyafail3 commented Jun 25, 2024

The following code demonstrates the issue:

struct{
    const int c;
}foo();
puts(_Generic(foo().c,int:"A",const int:"B"));

foo returns a structure that contains a qualified member named c of type const int. Calling foo results in an rvalue structure, which when accessing one of the members via . results in an rvalue. Clang does not remove qualifiers or _Atomic (C23 removes _Atomic from rvalues as well, and GCC does this in earlier versions as well) here while GCC does and the result of DR 423 seems intended to remove cases of qualified rvalues. Additionally, Clang actually generates a warning here with the default warnings stating that the const int branch will never be taken (and then it takes it anyway). The wording doesn't state that the result is the non-atomic (C23) and unqualified member when getting the member though I assume that is a defect.

@EugeneZelenko EugeneZelenko added c23 clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed new issue labels Jun 25, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 25, 2024

@llvm/issue-subscribers-clang-frontend

Author: None (Halalaluyafail3)

The following code demonstrates the issue: ```c struct{ const int c; }foo(); puts(_Generic(foo().c,int:"A",const int:"B")); ``` `foo` returns a structure that contains a qualified member named `c` of type `const int`. Calling `foo` results in an rvalue structure, which when accessing one of the members via `.` results in an rvalue. Clang does not remove qualifiers or _Atomic (C23 removes _Atomic from rvalues as well, and GCC does this in earlier versions as well) here while GCC does and the result of DR 423 seems intended to remove cases of qualified rvalues. Additionally, Clang actually generates a warning here with the default warnings stating that the `const int` branch will never be taken (and then it takes it anyway). The wording doesn't state that the result is the non-atomic (C23) and unqualified member when getting the member though I assume that is a defect.

@shafik
Copy link
Collaborator

shafik commented Jun 26, 2024

CC @AaronBallman

@AaronBallman
Copy link
Collaborator

The return statement in foo() is what determines the value that is returned, and that is an rvalue (C23 6.8.7.5p3) of the unnamed structure type, which is unqualified. So if we're doing anything wrong, I think it has more to do with the . operator. C23 6.5.3.4p3 says:

A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member,93) and is an lvalue if the first expression is an lvalue. If the first expression has qualified type, the result has the so-qualified version of the type of the designated member.

So the member named here is not an lvalue because the returned structure object is not an lvalue. However, "if the first expression has qualified type" can be read to apply here and Clang's behavior is correct. I took a look at what other C compilers do and the results are wonderful:

Clang, Chibicc, EDG: do not strip the qualifier
GCC, MSVC, TCC, CompCert: strips the qualifier
SDCC: claims there are multiple matching associations

So I'm going to ask on the WG14 reflectors to see what folks think.

@AaronBallman
Copy link
Collaborator

I lied, I didn't need to go ask WG14. _Generic's controlling operand behaves as though lvalue conversion happens and that needs to drop the qualifiers anyway.

@AaronBallman AaronBallman added the confirmed Verified by a second party label Jun 26, 2024
@AaronBallman
Copy link
Collaborator

I lied again and have asked on WG14 because I think this is a standards wording defect or I'm reading the standard wrong. lvalue conversion only applies to lvalues, so if we do get an rvalue from the member access expression, then the lvalue conversion wouldn't drop qualifiers because no conversion occurs.

https://godbolt.org/z/W64hxa4jj shows the compiler divergence for this.

@Halalaluyafail3
Copy link
Author

Halalaluyafail3 commented Jun 27, 2024

Interestingly, it seems that all compilers agree that with an array member qualifiers should not be removed in this situation (after decay it is no longer a qualified rvalue, but it being a qualified rvalue can still be observed with typeof). Also, I have discovered that GCC does not remove qualifiers from the result of . with an rvalue left operand in situations where lvalue conversions do not apply. That is, using the example typeof(foo().c) is the type const int but typeof(0,foo().c) is the type int. I'm thinking about making a bug report to GCC for that since it seems strange for an rvalue to undergo lvalue conversions.

@AaronBallman
Copy link
Collaborator

Committee discussion on the WG14 reflectors suggests a few things: 1) qualified rvalues are a thing in C and that was a benign thing largely because qualifiers were never observable until _Generic, but now means we need more clarity in the standard; 2) _Generic should still strip those qualifiers even if it gets an rvalue. (So the typeof behavior observing the qualifiers on the rvalue is correct.)

AaronBallman added a commit to AaronBallman/llvm-project that referenced this issue Jun 27, 2024
It is possible to get a qualified rvalue in C. Consider:

  struct { cont int i; } foo();
  _Generic(foo().i, ...);

foo() returns an rvalue for the anonymous structure, the member access
expression then results in an rvalue of type const int. The lvalue to
rvalue conversion we perform on the controlling expression of the
generic selection expression then fails to strip any qualifiers because
the expression is already an rvalue.

Discussion on the WG14 reflectors suggested that the qualifiers should
still be stripped from the type of the controlling expression; the
standard should be corrected to make this more clear.

Fixes llvm#96713
@Halalaluyafail3
Copy link
Author

If typeof can be used to observe the qualifiers and qualified rvalues exist, then that creates situations like the following:

struct{
    const int c;
}foo();
typeof(+foo().c)x;
x=1;

Here I see three possible results:

  1. This program violates a constraint, because the operand of the + operator is not an arithmetic type (it is a qualified arithmetic type). Strictly reading the standard, I think this would be correct interpretation.
  2. This program violates a constraint, because x=1 is assigning to a const object. This is the interpretation that Clang uses.
  3. This program is fine, and the type of x here is int. This is the interpretation that GCC uses.

@AaronBallman
Copy link
Collaborator

I don't think that program violates a constraint. C23 6.2.5p31:

... The qualified or unqualified versions of a type are distinct types that belong to
the same type category and have the same representation and alignment requirements. ...

I think Clang's handling of that is correct because the function returns an rvalue, the . operator yields an lvalue of type const int, and the unary + operator only does integer promotions (which don't include lvalue conversion but could change the type) so the result is still has type const int, and that is the operand to typeof, which does not undergo lvalue conversion.

A slightly different example would be:

struct{
    const short c;
}foo();
typeof(+foo().c)x;
x=1;

where I believe you would generally get int and not const short because of integer promotions:

If the original type is not a bit-precise integer type (6.2.5): if an int can represent all
values of the original type (as restricted by the width, for a bit-field), the value is converted to an
int;50) otherwise, it is converted to an unsigned int.

And... Clang seems to be doing exactly that: https://godbolt.org/z/ndsjo9E7Y

@Halalaluyafail3
Copy link
Author

Halalaluyafail3 commented Aug 20, 2024

I don't think that program violates a constraint. C23 6.2.5p31:

... The qualified or unqualified versions of a type are distinct types that belong to
the same type category and have the same representation and alignment requirements. ...

The term 'type category' is defined as such:

A type is characterized by its type category, which is either the outermost derivation of a derived
type (as noted previously in this subclause in the construction of derived types), or the type itself if
the type consists of no derived types.

Section 6.2.5 "Types" Paragraph 30 N3220

From how I read this, the type category of const int is int. I don't see how that makes const int included in arithmetic types. Additionally, the standard uses the the term 'qualified or unqualified character type' (in the definition of va_arg) which wouldn't make sense if this was the case as it would be redundant.

I think Clang's handling of that is correct because the function returns an rvalue, the . operator yields an lvalue of type const int, and the unary + operator only does integer promotions (which don't include lvalue conversion but could change the type) so the result is still has type const int, and that is the operand to typeof, which does not undergo lvalue conversion.

I assume you mean that the . operator yields an rvalue here, if it didn't then you could take the address of it which Clang doesn't allow. Following your line of thinking that const int is included in arithmetic types, and that integer promotions don't change the type so that it remains a qualified rvalue; shouldn't the following also be invalid?

struct{
    const int c;
}foo();
typeof(foo().c+foo().c)x;
x=1;

The types of the operands to the + (binary) operator here are both const int and the usual arithmetic conversions are applied.

Otherwise, if any of the two types is an enumeration, it is converted to its underlying type.
Then, the integer promotions are performed on both operands. Next, the following rules are
applied to the promoted operands:

If both operands have the same type, then no further conversion is needed.

Section 6.3.1.8 "Usual arithmetic conversions" Paragraph 1 N3220

Using this, the result of the addition should be const int yet Clang results in int anyways. Also, the result of foo().c+0 would be unclear.

@AaronBallman
Copy link
Collaborator

From how I read this, the type category of const int is int. I don't see how that makes const int included in arithmetic types. Additionally, the standard uses the the term 'qualified or unqualified character type' (in the definition of va_arg) which wouldn't make sense if this was the case as it would be redundant.

Can you imagine an interpretation where int is an integer type and const int is not an integer type? However, I think we're off in the weeds -- the standard is very unclear on these points, and that's the whole point to this issue.

I assume you mean that the . operator yields an rvalue here.

Yes, sorry for the confusion!

Using this, the result of the addition should be const int yet Clang results in int anyways. Also, the result of foo().c+0 would be unclear.

I get the same interpretation as you do.

IMO, this topic requires a WG14 paper to tease out what we want the standard to say. The author of the typeof proposal was clearly looking for typeof(foo().c) to be const int according to discussions both on and off list, and that matches the intent of the paper. The point to typeof is "I want the declared type of the thing being inspected" while typeof_unqual is "I want the non-atomic, unqualified type of the thing being inspected", but there's obviously a lot of edge cases involved. Another example would be C's adjustments to function types: https://godbolt.org/z/dzeWMz563 (which Clang doesn't correctly handle currently: #39494)

@Halalaluyafail3
Copy link
Author

Halalaluyafail3 commented Aug 21, 2024

IMO, this topic requires a WG14 paper to tease out what we want the standard to say. The author of the typeof proposal was clearly looking for typeof(foo().c) to be const int according to discussions both on and off list, and that matches the intent of the paper. The point to typeof is "I want the declared type of the thing being inspected" while typeof_unqual is "I want the non-atomic, unqualified type of the thing being inspected", but there's obviously a lot of edge cases involved. Another example would be C's adjustments to function types: https://godbolt.org/z/dzeWMz563 (which Clang doesn't correctly handle currently: #39494)

I think a good solution would be to convert to the type after lvalue conversions, if the expression was used in a context where lvalue conversions would apply had the expression been an lvalue instead. Similar to your proposed solution for this with _Generic but more generally. This would still allow typeof(foo().c) to be const int, applying . again to propagate qualifiers, and sizeof to properly determine the size of atomic members while not requiring consideration of how qualifiers pass through operators which I don't think was intended. Consider the case of atomics:

struct{
    _Atomic int c;
}foo(),bar;
+bar.c;//OK, lvalue conversions remove _Atomic
+foo().c;//Clang rejects

Clang rejects this because + requires an arithmetic type and _Atomic int is not considered an arithmetic type. Having this be valid on an lvalue but not on a non-lvalue expression doesn't really make much sense (this situation also happens with casts and function calls which result in atomic types as well).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c23 clang:frontend Language frontend issues, e.g. anything involving "Sema" confirmed Verified by a second party
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants