Skip to content

[SYCL][HIP] Support amd-gpu-gfx1034 as an acceptable value for -fsycl-targets #8106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 3, 2023

Conversation

mmoadeli
Copy link
Contributor

@mmoadeli mmoadeli commented Jan 25, 2023

  • AMD GFX1034 is ROCm supported architecture and as such amd-gpu-gfx1034 is added as an acceptable value for -fsycl-targets.
  • Remove Added in version for supported architectures.

@mmoadeli mmoadeli requested review from a team as code owners January 25, 2023 22:57
@mmoadeli mmoadeli linked an issue Jan 25, 2023 that may be closed by this pull request
@mmoadeli mmoadeli temporarily deployed to aws January 25, 2023 23:23 — with GitHub Actions Inactive
@mmoadeli mmoadeli temporarily deployed to aws January 25, 2023 23:54 — with GitHub Actions Inactive
@zjin-lcf
Copy link
Contributor

I have a question about RDNA3. Need gfx1100, 1101, 1102, 1103 be added ?

@mmoadeli
Copy link
Contributor Author

I have a question about RDNA3. Need gfx1100, 1101, 1102, 1103 be added ?

Only ROCm supported AMD GPU architectures are supported. Unfortunately, I could not find an up to date list of the ROCm supported AMD GPU architectures. If they can be verified to be ROCm supported, they will be added.

@keryell
Copy link
Contributor

keryell commented Jan 26, 2023

I have a question about RDNA3. Need gfx1100, 1101, 1102, 1103 be added ?

Only ROCm supported AMD GPU architectures are supported. Unfortunately, I could not find an up to date list of the ROCm supported AMD GPU architectures. If they can be verified to be ROCm supported, they will be added.

This is why I would just pass down the stream the AMD GPU architecture without doing any validation here.
At most add a warning or add an option to skip the enforcement?
At the end it might just work™ enough for some use cases or be useful for experimenting.

@keryell
Copy link
Contributor

keryell commented Jan 26, 2023

Looking at the existing list in the code, there are super old AMD GPU which are not or at least no longer supported by ROCm for ages anyway (gfx700...).

@mmoadeli
Copy link
Contributor Author

Looking at the existing list in the code, there are super old AMD GPU which are not or at least no longer supported by ROCm for ages anyway (gfx700...).

The list is prepared based on the information at https://llvm.org/docs/AMDGPUUsage.html.

@keryell
Copy link
Contributor

keryell commented Jan 26, 2023

The list is prepared based on the information at llvm.org/docs/AMDGPUUsage.html.

That makes sense.
Note there are also even older but also newer devices in this list, like the ones mentioned by @zjin-lcf.

@zjin-lcf
Copy link
Contributor

@mmoadeli

Browsing your changes let me know about adding new devices. So, I find codes related to gfx11xx devices in the AMD LLVM repo (https://github.com/RadeonOpenCompute/llvm-project)

Basic/Cuda.cpp: GFX(1100), // gfx1100
Basic/Cuda.cpp: GFX(1101), // gfx1101
Basic/Cuda.cpp: GFX(1102), // gfx1102
Basic/Cuda.cpp: GFX(1103), // gfx1103
Basic/Targets/AMDGPU.cpp: Features["gfx11-insts"] = true;

@mmoadeli
Copy link
Contributor Author

mmoadeli commented Jan 26, 2023

@zjin-lcf
Do you have any of the suggested devices?
We should be able to add them, if we are able to build samples on the device using the command below (XXXX replacing the valid number).
clang++ -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend --offload-arch=gfxXXXX

@zjin-lcf
Copy link
Contributor

@mmoadeli

I don't have gfx11xx devices for evaluation. I thought that users would not file a bug report similar to the one you are solving when they are evaluating SYCL with any of the four gfx11xx devices.

@al42and
Copy link
Contributor

al42and commented Jan 26, 2023

This is why I would just pass down the stream the AMD GPU architecture without doing any validation here.
At most add a warning or add an option to skip the enforcement?

I also think that just letting users specify whatever they want is a better approach.

Maybe having extensive validation for "convenience" syntax -fsycl-targets=amd_gpu_gfx1034 is okay. But I don't think it makes sense to deliberately limit what the user can pass as backend options through the full -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend --offload-arch=gfxXXXX syntax. As far as I can tell, there's already (some) support for the upcoming gfx940 in LLVM. I know some people were using LLVM on 90c APUs. Currently, trying to compile for them fails due to a broken internal Clang command line (#8016).

Opening an issue, waiting for a fix, bumping the feature test macro, all the while preventing the people from testing the code on the device, I think, is a maintenance burden and not very user-friendly.

@mmoadeli
Copy link
Contributor Author

This is why I would just pass down the stream the AMD GPU architecture without doing any validation here.
At most add a warning or add an option to skip the enforcement?

I also think that just letting users specify whatever they want is a better approach.

Maybe having extensive validation for "convenience" syntax -fsycl-targets=amd_gpu_gfx1034 is okay. But I don't think it makes sense to deliberately limit what the user can pass as backend options through the full -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend --offload-arch=gfxXXXX syntax. As far as I can tell, there's already (some) support for the upcoming gfx940 in LLVM. I know some people were using LLVM on 90c APUs. Currently, trying to compile for them fails due to a broken internal Clang command line (#8016).

Opening an issue, waiting for a fix, bumping the feature test macro, all the while preventing the people from testing the code on the device, I think, is a maintenance burden and not very user-friendly.

@al42and thanksI agree.
Would you mind creating an issue and assign it to me please?

@al42and
Copy link
Contributor

al42and commented Jan 26, 2023

@mmoadeli, done. See #8112.

@al42and
Copy link
Contributor

al42and commented Jan 26, 2023

That said, this patch fixes the problem with gfx1034 specifically just fine; can confirm that it works.

…he wording in the "experimental extension"

- Remov `Added in version` for supported architectures.
@mmoadeli mmoadeli temporarily deployed to aws February 1, 2023 23:29 — with GitHub Actions Inactive
@mmoadeli mmoadeli temporarily deployed to aws February 2, 2023 00:20 — with GitHub Actions Inactive
- Adds experimental table instead of version one.
@mmoadeli mmoadeli temporarily deployed to aws February 2, 2023 15:03 — with GitHub Actions Inactive
@mmoadeli mmoadeli temporarily deployed to aws February 2, 2023 15:42 — with GitHub Actions Inactive
Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spec and doc changes LGTM.

@bader bader changed the title Support amd-gpu-gfx1034 as an acceptable value for -fsycl-targets [SYCL] Support amd-gpu-gfx1034 as an acceptable value for -fsycl-targets Feb 2, 2023
@mmoadeli mmoadeli changed the title [SYCL] Support amd-gpu-gfx1034 as an acceptable value for -fsycl-targets [SYCL][HIP] Support amd-gpu-gfx1034 as an acceptable value for -fsycl-targets Feb 2, 2023
@mmoadeli mmoadeli temporarily deployed to aws February 3, 2023 15:36 — with GitHub Actions Inactive
@mmoadeli mmoadeli temporarily deployed to aws February 3, 2023 16:28 — with GitHub Actions Inactive
@bader
Copy link
Contributor

bader commented Feb 3, 2023

Failed Tests (1):
SYCL :: KernelAndProgram/spec_constants_after_link.cpp

I hope this failure is not related to the patch. This test fails in pre-commit for other PRs as well.

@bader bader merged commit 5e86a41 into intel:sycl Feb 3, 2023
@zjin-lcf
Copy link
Contributor

zjin-lcf commented Feb 8, 2023

@bader Is a gfx1034 device needed for the test ?

@bader
Copy link
Contributor

bader commented Feb 8, 2023

This is the question for @mmoadeli.
AFAIK, AMD GPUs programs are not portable between architectures. So, specific HW is required if you test program execution. Compilation stage can be tested with cross-compilation.

@mmoadeli mmoadeli deleted the supprt-amd-gpu-gfx1034 branch July 7, 2023 10:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[HIP] ICE when compiling for gfx1034
8 participants