Skip to content

Commit 99a563a

Browse files
erichkeaneIanWood1
authored andcommitted
[OpenACC] Switch Clang to use the Flang 'appertainment' rules for cla… (llvm#135372)
…uses The Flang implemenation of OpenACC uses a .td file in the llvm/Frontend directory to determine appertainment in 4 categories: -Required: If this list has items in it, the directive requires at least 1 of these be present. -AllowedExclusive: Items on this list are all allowed, but only 1 from the list may be here (That is, they are exclusive of eachother). -AllowedOnce: Items on this list are all allowed, but may not be duplicated. Allowed: Items on this list are allowed. Note th at the actual list of 'allowed' is all 4 of these lists together. This is a draft patch to swtich Clang over to use those tables. Surgery to get this to happen in Clang Sema was somewhat reasonable. However, some gaps in the implementations are obvious, the existing clang implementation disagrees with the Flang interpretation of it. SO, we're keeping a task list here based on what gets discovered. Changes to Clang: - [x] Switch 'directive-kind' enum conversions to use tablegen See ff1a7bd - [x] Switch 'clause-kind' enum conversions to use tablegen See ff1a7bd - [x] Investigate 'parse' test differences to see if any new disagreements arise. - [x] Clang/Flang disagree as to whether 'collapse' can be multiple times on a loop. Further research showed no prose to limit this, and the comment on the clang implementation said "no good reason to allow", so no standards justification. - [x] Clang/Flang disagree whether 'num_gangs' can appear >1 on a compute/combined construct. This ended up being an unjustified restriction. - [x] Clang/Flang disagree as to the list of required clauses on a 'set' construct. My research shows that Clang mistakenly included 'if' in the list, and that it should be just 'default_async', 'device_num', and 'device_type'. - [x] Order of 'at least one of' diagnostic has changed. Tests were updated. - [x] Ensure we are properly 'de-aliasing' clause names in appertainment checks? - [x] What is 'shortloop'? 'shortloop' seems to be an old non-standard extension that isn't supported by flang, but is parsed for backward compat reasons. Clang won't parse, but we at least have a spot for it in the clause list. - [x] Implemented proposed change for 'routine' gang/worker/vector/seq. (see issue 539) - [x] Implement init/shutdown can only have 1 'if' (see issue 540) - [x] Clang/Flang disagree as to whether 'tile' is permitted more than once on a 'loop' or combined constructs (Flang prohibits >1). I see no justification for this in the standard. EDIT: I found a comment in clang that I did this to make SOMETHING around duplicate checks easier. Discussion showed we should actually have a better behavior around 'device_type' and duplicates, so I've since implemented that. - [x] Clang/Flang disagree whether 'gang', 'worker', or 'vector' may appear on the same construct as a 'seq' on a 'loop' or 'combined'. There is prose for this in 2022: (a gang, worker, or vector clause may not appear if a 'seq' clause appears). EDIT: These don't actually disagree, but aren't in the .td file, so I restored the existing code to do this. - [x] Clang/Flang disagree on whether 'bind' can appear >1 on a 'routine'. I believe line 3096 (A bind clause may not bind to a routine name that has a visible bind clause) makes this limitation (Flang permits >1 bind). we discussed and decided this should have the same rules as worker/vector/etc, except without the 'exactly 1 of' rule (so no dupes in individual sections). - [x] Clang/Flang disagree on whether 'init'/'shutdown' can have multiple 'device_num' clauses. I believe there is no supporting prose for this limitation., We decided that `device_num` should only happen 1x. - [x] Clang/Flang disagree whether 'num_gangs' can appear >1 on a 'kernels' construct. Line 1173 (On a kernels construct, the num_gangs clause must have a single argument) justifies limiting on a per-arguement basis, but doesn't do so for multiple num_gangs clauses. WE decided to do this with the '1-per-device-type' region for num_gangs, num_workers, and vector_length, see openacc bug here: https://github.com/OpenACC/openacc-spec/issues/541 Changes to Flang: - [x] Clang/Flang disgree on whether 'atomic' can take an 'if' clause. This was added in OpenACC3.3_Next See llvm#135451 - [x] Clang/Flang disagree on whether 'finalize' can be allowed >1 times on a 'exit_data' construct. see llvm#135415. - [x] Clang/Flang disagree whether 'if_present' should be allowed >1 times on a 'host_data'/'update' construct. see llvm#135422 - [x] Clang/Flang disagree on whether 'init'/'shutdown' can have multiple 'device_type' clauses. I believe there is no supporting prose for this limitation. - [ ] SEE change for num_gangs/etc above. Changes that need discussion/research:
1 parent 9e76f39 commit 99a563a

File tree

51 files changed

+1727
-1048
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

51 files changed

+1727
-1048
lines changed

clang/include/clang/Basic/DiagnosticSemaKinds.td

Lines changed: 15 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -12961,6 +12961,19 @@ def err_acc_clause_after_device_type
1296112961
def err_acc_clause_cannot_combine
1296212962
: Error<"OpenACC clause '%0' may not appear on the same construct as a "
1296312963
"'%1' clause on a '%2' construct">;
12964+
def err_acc_clause_routine_cannot_combine_before_device_type
12965+
: Error<"OpenACC clause '%0' after 'device_type' clause on a 'routine' "
12966+
"conflicts with the '%1' clause before the first 'device_type'">;
12967+
def err_acc_clause_routine_cannot_combine_same_device_type
12968+
: Error<"OpenACC clause '%0' on a 'routine' directive conflicts with the "
12969+
"'%1' clause applying to the same 'device_type'">;
12970+
def err_acc_clause_routine_one_of_in_region
12971+
: Error<"OpenACC 'routine' construct must have at least one 'gang', 'seq', "
12972+
"'vector', or 'worker' clause that applies to each 'device_type'">;
12973+
def err_acc_clause_since_last_device_type
12974+
: Error<"OpenACC '%0' clause cannot appear more than once%select{| in a "
12975+
"'device_type' region}2 on a '%1' directive">;
12976+
1296412977
def err_acc_reduction_num_gangs_conflict
1296512978
: Error<"OpenACC '%1' clause %select{|with more than 1 argument }0may not "
1296612979
"appear on a '%2' construct "
@@ -12978,9 +12991,6 @@ def note_acc_reduction_composite_member_loc : Note<"invalid field is here">;
1297812991
def err_acc_loop_not_for_loop
1297912992
: Error<"OpenACC '%0' construct can only be applied to a 'for' loop">;
1298012993
def note_acc_construct_here : Note<"'%0' construct is here">;
12981-
def err_acc_loop_spec_conflict
12982-
: Error<"OpenACC clause '%0' on '%1' construct conflicts with previous "
12983-
"data dependence clause">;
1298412994
def err_acc_collapse_loop_count
1298512995
: Error<"OpenACC 'collapse' clause loop count must be a %select{constant "
1298612996
"expression|positive integer value, evaluated to %1}0">;
@@ -13032,9 +13042,10 @@ def err_acc_gang_reduction_numgangs_conflict
1303213042
"'%1' clause %select{inside a compute construct with a|and a}3 "
1303313043
"'num_gangs' clause with more than one argument">;
1303413044

13035-
def err_reduction_op_mismatch
13045+
def err_reduction_op_mismatch
1303613046
: Error<"OpenACC 'reduction' variable must have the same operator in all "
1303713047
"nested constructs (%0 vs %1)">;
13048+
1303813049
def err_acc_loop_variable_type
1303913050
: Error<"loop variable of loop associated with an OpenACC '%0' construct "
1304013051
"must be of integer, pointer, or random-access-iterator type (is "

clang/include/clang/Basic/OpenACCKinds.h

Lines changed: 90 additions & 80 deletions
Original file line numberDiff line numberDiff line change
@@ -202,133 +202,140 @@ inline llvm::raw_ostream &operator<<(llvm::raw_ostream &Out,
202202
return printOpenACCAtomicKind(Out, AK);
203203
}
204204

205-
/// Represents the kind of an OpenACC clause.
205+
/// Represents the kind of an OpenACC clause. Sorted alphabetically, since this
206+
/// order ends up influencing the sorting of the list diagnostic.
206207
enum class OpenACCClauseKind : uint8_t {
207-
/// 'finalize' clause, allowed on 'exit data' directive.
208-
Finalize,
209-
/// 'if_present' clause, allowed on 'host_data' and 'update' directives.
210-
IfPresent,
211-
/// 'seq' clause, allowed on 'loop' and 'routine' directives.
212-
Seq,
213-
/// 'independent' clause, allowed on 'loop' directives.
214-
Independent,
208+
/// 'async' clause, allowed on Compute, Data, 'update', 'wait', and Combined
209+
/// constructs.
210+
Async,
211+
/// 'attach' clause, allowed on Compute and Combined constructs, plus 'data'
212+
/// and 'enter data'.
213+
Attach,
215214
/// 'auto' clause, allowed on 'loop' directives.
216215
Auto,
217-
/// 'worker' clause, allowed on 'loop', Combined, and 'routine' directives.
218-
Worker,
219-
/// 'vector' clause, allowed on 'loop', Combined, and 'routine' directives.
220-
Vector,
221-
/// 'nohost' clause, allowed on 'routine' directives.
222-
NoHost,
223-
/// 'default' clause, allowed on parallel, serial, kernel (and compound)
224-
/// constructs.
225-
Default,
226-
/// 'if' clause, allowed on all the Compute Constructs, Data Constructs,
227-
/// Executable Constructs, and Combined Constructs.
228-
If,
229-
/// 'self' clause, allowed on Compute and Combined Constructs, plus 'update'.
230-
Self,
216+
/// 'bind' clause, allowed on routine constructs.
217+
Bind,
218+
/// 'collapse' clause, allowed on 'loop' and Combined constructs.
219+
Collapse,
231220
/// 'copy' clause, allowed on Compute and Combined Constructs, plus 'data' and
232221
/// 'declare'.
233222
Copy,
234223
/// 'copy' clause alias 'pcopy'. Preserved for diagnostic purposes.
235224
PCopy,
236225
/// 'copy' clause alias 'present_or_copy'. Preserved for diagnostic purposes.
237226
PresentOrCopy,
238-
/// 'use_device' clause, allowed on 'host_data' construct.
239-
UseDevice,
240-
/// 'attach' clause, allowed on Compute and Combined constructs, plus 'data'
241-
/// and 'enter data'.
242-
Attach,
227+
/// 'copyin' clause, allowed on Compute and Combined constructs, plus 'data',
228+
/// 'enter data', and 'declare'.
229+
CopyIn,
230+
/// 'copyin' clause alias 'pcopyin'. Preserved for diagnostic purposes.
231+
PCopyIn,
232+
/// 'copyin' clause alias 'present_or_copyin'. Preserved for diagnostic
233+
/// purposes.
234+
PresentOrCopyIn,
235+
/// 'copyout' clause, allowed on Compute and Combined constructs, plus 'data',
236+
/// 'exit data', and 'declare'.
237+
CopyOut,
238+
/// 'copyout' clause alias 'pcopyout'. Preserved for diagnostic purposes.
239+
PCopyOut,
240+
/// 'copyout' clause alias 'present_or_copyout'. Preserved for diagnostic
241+
/// purposes.
242+
PresentOrCopyOut,
243+
/// 'create' clause, allowed on Compute and Combined constructs, plus 'data',
244+
/// 'enter data', and 'declare'.
245+
Create,
246+
/// 'create' clause alias 'pcreate'. Preserved for diagnostic purposes.
247+
PCreate,
248+
/// 'create' clause alias 'present_or_create'. Preserved for diagnostic
249+
/// purposes.
250+
PresentOrCreate,
251+
/// 'default' clause, allowed on parallel, serial, kernel (and compound)
252+
/// constructs.
253+
Default,
254+
/// 'default_async' clause, allowed on 'set' construct.
255+
DefaultAsync,
243256
/// 'delete' clause, allowed on the 'exit data' construct.
244257
Delete,
245258
/// 'detach' clause, allowed on the 'exit data' construct.
246259
Detach,
247260
/// 'device' clause, allowed on the 'update' construct.
248261
Device,
262+
/// 'device_num' clause, allowed on 'init', 'shutdown', and 'set' constructs.
263+
DeviceNum,
249264
/// 'deviceptr' clause, allowed on Compute and Combined Constructs, plus
250265
/// 'data' and 'declare'.
251266
DevicePtr,
252267
/// 'device_resident' clause, allowed on the 'declare' construct.
253268
DeviceResident,
269+
/// 'device_type' clause, allowed on Compute, 'data', 'init', 'shutdown',
270+
/// 'set', update', 'loop', 'routine', and Combined constructs.
271+
DeviceType,
272+
/// 'dtype' clause, an alias for 'device_type', stored separately for
273+
/// diagnostic purposes.
274+
DType,
275+
/// 'finalize' clause, allowed on 'exit data' directive.
276+
Finalize,
254277
/// 'firstprivate' clause, allowed on 'parallel', 'serial', 'parallel loop',
255278
/// and 'serial loop' constructs.
256279
FirstPrivate,
280+
/// 'gang' clause, allowed on 'loop' and Combined constructs.
281+
Gang,
257282
/// 'host' clause, allowed on 'update' construct.
258283
Host,
284+
/// 'if' clause, allowed on all the Compute Constructs, Data Constructs,
285+
/// Executable Constructs, and Combined Constructs.
286+
If,
287+
/// 'if_present' clause, allowed on 'host_data' and 'update' directives.
288+
IfPresent,
289+
/// 'independent' clause, allowed on 'loop' directives.
290+
Independent,
259291
/// 'link' clause, allowed on 'declare' construct.
260292
Link,
261293
/// 'no_create' clause, allowed on allowed on Compute and Combined constructs,
262294
/// plus 'data'.
263295
NoCreate,
296+
/// 'nohost' clause, allowed on 'routine' directives.
297+
NoHost,
298+
/// 'num_gangs' clause, allowed on 'parallel', 'kernels', parallel loop', and
299+
/// 'kernels loop' constructs.
300+
NumGangs,
301+
/// 'num_workers' clause, allowed on 'parallel', 'kernels', parallel loop',
302+
/// and 'kernels loop' constructs.
303+
NumWorkers,
264304
/// 'present' clause, allowed on Compute and Combined constructs, plus 'data'
265305
/// and 'declare'.
266306
Present,
267307
/// 'private' clause, allowed on 'parallel', 'serial', 'loop', 'parallel
268308
/// loop', and 'serial loop' constructs.
269309
Private,
270-
/// 'copyout' clause, allowed on Compute and Combined constructs, plus 'data',
271-
/// 'exit data', and 'declare'.
272-
CopyOut,
273-
/// 'copyout' clause alias 'pcopyout'. Preserved for diagnostic purposes.
274-
PCopyOut,
275-
/// 'copyout' clause alias 'present_or_copyout'. Preserved for diagnostic
276-
/// purposes.
277-
PresentOrCopyOut,
278-
/// 'copyin' clause, allowed on Compute and Combined constructs, plus 'data',
279-
/// 'enter data', and 'declare'.
280-
CopyIn,
281-
/// 'copyin' clause alias 'pcopyin'. Preserved for diagnostic purposes.
282-
PCopyIn,
283-
/// 'copyin' clause alias 'present_or_copyin'. Preserved for diagnostic
284-
/// purposes.
285-
PresentOrCopyIn,
286-
/// 'create' clause, allowed on Compute and Combined constructs, plus 'data',
287-
/// 'enter data', and 'declare'.
288-
Create,
289-
/// 'create' clause alias 'pcreate'. Preserved for diagnostic purposes.
290-
PCreate,
291-
/// 'create' clause alias 'present_or_create'. Preserved for diagnostic
292-
/// purposes.
293-
PresentOrCreate,
294310
/// 'reduction' clause, allowed on Parallel, Serial, Loop, and the combined
295311
/// constructs.
296312
Reduction,
297-
/// 'collapse' clause, allowed on 'loop' and Combined constructs.
298-
Collapse,
299-
/// 'bind' clause, allowed on routine constructs.
300-
Bind,
313+
/// 'self' clause, allowed on Compute and Combined Constructs, plus 'update'.
314+
Self,
315+
/// 'seq' clause, allowed on 'loop' and 'routine' directives.
316+
Seq,
317+
/// 'tile' clause, allowed on 'loop' and Combined constructs.
318+
Tile,
319+
/// 'use_device' clause, allowed on 'host_data' construct.
320+
UseDevice,
321+
/// 'vector' clause, allowed on 'loop', Combined, and 'routine' directives.
322+
Vector,
301323
/// 'vector_length' clause, allowed on 'parallel', 'kernels', 'parallel loop',
302324
/// and 'kernels loop' constructs.
303325
VectorLength,
304-
/// 'num_gangs' clause, allowed on 'parallel', 'kernels', parallel loop', and
305-
/// 'kernels loop' constructs.
306-
NumGangs,
307-
/// 'num_workers' clause, allowed on 'parallel', 'kernels', parallel loop',
308-
/// and 'kernels loop' constructs.
309-
NumWorkers,
310-
/// 'device_num' clause, allowed on 'init', 'shutdown', and 'set' constructs.
311-
DeviceNum,
312-
/// 'default_async' clause, allowed on 'set' construct.
313-
DefaultAsync,
314-
/// 'device_type' clause, allowed on Compute, 'data', 'init', 'shutdown',
315-
/// 'set', update', 'loop', 'routine', and Combined constructs.
316-
DeviceType,
317-
/// 'dtype' clause, an alias for 'device_type', stored separately for
318-
/// diagnostic purposes.
319-
DType,
320-
/// 'async' clause, allowed on Compute, Data, 'update', 'wait', and Combined
321-
/// constructs.
322-
Async,
323-
/// 'tile' clause, allowed on 'loop' and Combined constructs.
324-
Tile,
325-
/// 'gang' clause, allowed on 'loop' and Combined constructs.
326-
Gang,
327326
/// 'wait' clause, allowed on Compute, Data, 'update', and Combined
328327
/// constructs.
329328
Wait,
329+
/// 'worker' clause, allowed on 'loop', Combined, and 'routine' directives.
330+
Worker,
330331

331-
/// Represents an invalid clause, for the purposes of parsing.
332+
/// 'shortloop' is represented in the ACC.td file, but isn't present in the
333+
/// standard. This appears to be an old extension for the nvidia fortran
334+
// compiler, but seemingly not elsewhere. Put it here as a placeholder, but it
335+
// is never expected to be generated.
336+
Shortloop,
337+
/// Represents an invalid clause, for the purposes of parsing. Should be
338+
/// 'last'.
332339
Invalid,
333340
};
334341

@@ -485,6 +492,9 @@ inline StreamTy &printOpenACCClauseKind(StreamTy &Out, OpenACCClauseKind K) {
485492
case OpenACCClauseKind::Wait:
486493
return Out << "wait";
487494

495+
case OpenACCClauseKind::Shortloop:
496+
llvm_unreachable("Shortloop shouldn't be generated in clang");
497+
LLVM_FALLTHROUGH;
488498
case OpenACCClauseKind::Invalid:
489499
return Out << "<invalid>";
490500
}

clang/include/clang/Sema/SemaOpenACC.h

Lines changed: 17 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -78,7 +78,7 @@ class SemaOpenACC : public SemaBase {
7878
/// Collapse has an 'N' count that makes it apply to a number of loops 'below'
7979
/// it.
8080
struct CollapseCheckingInfo {
81-
OpenACCCollapseClause *ActiveCollapse = nullptr;
81+
const OpenACCCollapseClause *ActiveCollapse = nullptr;
8282

8383
/// This is a value that maintains the current value of the 'N' on the
8484
/// current collapse, minus the depth that has already been traversed. When
@@ -107,7 +107,7 @@ class SemaOpenACC : public SemaBase {
107107
/// own counting of elements.
108108
UnsignedOrNone CurTileCount = std::nullopt;
109109

110-
/// Records whether we've hit a 'CurTileCount' of '0' on the wya down,
110+
/// Records whether we've hit a 'CurTileCount' of '0' on the way down,
111111
/// which allows us to diagnose if the number of arguments is too large for
112112
/// the current number of 'for' loops.
113113
bool TileDepthSatisfied = true;
@@ -177,6 +177,21 @@ class SemaOpenACC : public SemaBase {
177177

178178
void CheckLastRoutineDeclNameConflict(const NamedDecl *ND);
179179

180+
bool DiagnoseRequiredClauses(OpenACCDirectiveKind DK, SourceLocation DirLoc,
181+
ArrayRef<const OpenACCClause *> Clauses);
182+
183+
bool DiagnoseAllowedClauses(OpenACCDirectiveKind DK, OpenACCClauseKind CK,
184+
SourceLocation ClauseLoc);
185+
186+
public:
187+
// Needed from the visitor, so should be public.
188+
bool DiagnoseAllowedOnceClauses(OpenACCDirectiveKind DK, OpenACCClauseKind CK,
189+
SourceLocation ClauseLoc,
190+
ArrayRef<const OpenACCClause *> Clauses);
191+
bool DiagnoseExclusiveClauses(OpenACCDirectiveKind DK, OpenACCClauseKind CK,
192+
SourceLocation ClauseLoc,
193+
ArrayRef<const OpenACCClause *> Clauses);
194+
180195
public:
181196
ComputeConstructInfo &getActiveComputeConstructInfo() {
182197
return ActiveComputeConstructInfo;

clang/lib/AST/OpenACCClause.cpp

Lines changed: 4 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -190,18 +190,15 @@ OpenACCCollapseClause::OpenACCCollapseClause(SourceLocation BeginLoc,
190190
SourceLocation EndLoc)
191191
: OpenACCClauseWithSingleIntExpr(OpenACCClauseKind::Collapse, BeginLoc,
192192
LParenLoc, LoopCount, EndLoc),
193-
HasForce(HasForce) {
194-
assert(LoopCount && "LoopCount required");
195-
}
193+
HasForce(HasForce) {}
196194

197195
OpenACCCollapseClause *
198196
OpenACCCollapseClause::Create(const ASTContext &C, SourceLocation BeginLoc,
199197
SourceLocation LParenLoc, bool HasForce,
200198
Expr *LoopCount, SourceLocation EndLoc) {
201-
assert(
202-
LoopCount &&
203-
(LoopCount->isInstantiationDependent() || isa<ConstantExpr>(LoopCount)) &&
204-
"Loop count not constant expression");
199+
assert((!LoopCount || (LoopCount->isInstantiationDependent() ||
200+
isa<ConstantExpr>(LoopCount))) &&
201+
"Loop count not constant expression");
205202
void *Mem =
206203
C.Allocate(sizeof(OpenACCCollapseClause), alignof(OpenACCCollapseClause));
207204
return new (Mem)

clang/lib/AST/TextNodeDumper.cpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -433,6 +433,7 @@ void TextNodeDumper::Visit(const OpenACCClause *C) {
433433
case OpenACCClauseKind::Vector:
434434
case OpenACCClauseKind::VectorLength:
435435
case OpenACCClauseKind::Invalid:
436+
case OpenACCClauseKind::Shortloop:
436437
// The condition expression will be printed as a part of the 'children',
437438
// but print 'clause' here so it is clear what is happening from the dump.
438439
OS << " clause";

clang/lib/Parse/ParseOpenACC.cpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -537,6 +537,8 @@ ClauseParensKind getClauseParensKind(OpenACCDirectiveKind DirKind,
537537
case OpenACCClauseKind::Tile:
538538
return ClauseParensKind::Required;
539539

540+
case OpenACCClauseKind::Shortloop:
541+
llvm_unreachable("Shortloop shouldn't be generated in clang");
540542
case OpenACCClauseKind::Auto:
541543
case OpenACCClauseKind::Finalize:
542544
case OpenACCClauseKind::IfPresent:

clang/lib/Sema/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -75,6 +75,7 @@ add_clang_library(clangSema
7575
SemaOpenACC.cpp
7676
SemaOpenACCAtomic.cpp
7777
SemaOpenACCClause.cpp
78+
SemaOpenACCClauseAppertainment.cpp
7879
SemaOpenCL.cpp
7980
SemaOpenMP.cpp
8081
SemaOverload.cpp

0 commit comments

Comments
 (0)