-
Notifications
You must be signed in to change notification settings - Fork 788
[Driver][SYCL] Improve integration header usage when performing offload #3471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When generating the integration header, we were performing an additional compilation step which is not needed, as we can create the output file and the integration header at the same time. Improve this behavior by tracking the integration files based on given input files, creating and using the files accordingly when needed during the compilation flow.
@mdtoguchi, great improvement! In nutshell, @Naghasan suggests we consider using a host compiler to emit the data from the integration header. OpenMP-offload uses such approach for binding offloaded code with the host application. I assume CUDA and HIP compilers are doing something similar. I was going to check if we can apply this approach to SYCL and gather all pros and cons for both approaches. Tagging @erichkeane and @AaronBallman. |
@@ -5113,6 +5107,21 @@ void Driver::BuildActions(Compilation &C, DerivedArgList &Args, | |||
|
|||
handleArguments(C, Args, Inputs, Actions); | |||
|
|||
// When compiling for -fsycl, generate the integration header files that | |||
// will be used during the compilation. | |||
if (Args.hasFlag(options::OPT_fsycl, options::OPT_fno_sycl, false)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be also working when using -fsycl-device-only
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-fsycl-device-only
doesn't perform the host compilation, so the generated integration header would not be used. Unless there is a use model for it, I don't think it's necessary.
I don't think we can do it in host mode for 2 reasons: |
I created #3474 to discuss the approach for upstreaming to llorg. |
Well if they are missing during the host compilation, I don't see how you can invoke them.
The design permits this but it doesn't mandate it, this is an implementation design choice not a spec requirement. The 1.2.1 use to describe a single pass compiler as an alternative to the multi-pass approach BTW. |
The whole purpose of the integration header is to provide information to the host compiler that is only available at device compilation though. As far as a single-pass compiler, great! But we don't have one. As far as a separate host compiler, our implementation has that requirement. |
stdcpp_compat.cpp was malformed with the options used layout_accessors_host.cpp was expecting output from 3 compilations, which is now down to 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally agree with @erichkeane on the subject of host compiler-based strategy - emitting the header during the device FE compilation is the only viable approach if we're to adhere to the "custom host compiler" feature requirement. Not to mention that the chosen approach is a simpler one. In a nutshell, I believe that if there is no strong community pushback against device-side emission of the integration header, we should stick to what this PR currently provides.
Overall, the improving changes look good to me - much cleaner and far less hacky than I personally expected from the filename-mapping approach. I'll need to spend some more time on reading the code itself, hitting "Approve" right after that iteration.
P. S. Coming back from a vacation and seeing an important solution like this one adds up to the positive review experience :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation LGTM!
@vladimirlaz, @intel/llvm-reviewers-runtime, ping. |
When generating the integration header, we were performing an additional
compilation step which is not needed, as we can create the output file
and the integration header at the same time.
Improve this behavior by tracking the integration files based on given
input files, creating and using the files accordingly when needed during
the compilation flow.