
Conversation


@xuebwang-amd xuebwang-amd commented Sep 4, 2025

Purpose

This PR aims to support inference of layerwise mixed-precision quantized models, extending beyond models quantized with a single scheme such as MXFP4 or FP8 (i.e., PTQ models).

Here, the layerwise mixed-precision configuration for a given model is searched for and the model is then quantized by amd-quark. Specifically, this PR focuses on the mixed scheme {MXFP4, FP8}, where FP8 denotes the FP8 per-tensor scheme.

With a mixed-precision quantized model, one can achieve an optimal balance between accuracy and hardware metrics.
To demonstrate the benefits of mixed precision in this PR, we report model accuracies on several commonly used tasks, using only the Quark emulation kernel for MXFP4 and the Triton kernel for FP8.
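
As a usage illustration (not part of this diff), here is a minimal offline-inference sketch against one of the checkpoints listed in the Test Plan below. The quantization method is normally inferred from the checkpoint's quantization_config, so passing quantization="quark" explicitly is optional.

from vllm import LLM, SamplingParams

# Hedged smoke-test sketch: load a layerwise {MXFP4, FP8} Quark checkpoint
# (model name taken from the Test Plan of this PR) and generate a few tokens.
llm = LLM(model="amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8",
          quantization="quark")

outputs = llm.generate(["The capital of France is"],
                       SamplingParams(temperature=0.0, max_tokens=16))
print(outputs[0].outputs[0].text)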

Test Plan

Test on

  1. amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8
  2. amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8
  3. amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8

Test Result

(Accuracy results for the three models were attached as an image.)

List of TODO items

  • Layerwise mixed-precision quantization with the {MXFP4, FP8} scheme (the goal of this PR)
  • Extend model coverage
  • Benchmark hardware metrics
  • Further support unquantized Linear and/or MoE layer(s) in the mixed-precision scheme, i.e., {MXFP4, FP8, BF16/FP16}
  • Further support the MXFP6 scheme for mixed-precision quantization, i.e., {MXFP4, MXFP6, FP8, BF16/FP16}

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request extends Quark to support mixed-precision models, specifically for {MXFP4, FP8} schemes. The changes involve updating quantization configuration logic to handle mixed-precision setups and adding new tests to validate model accuracies. My review identified two high-severity issues. First, in the new test file, environment variables are not handled safely, which could lead to test state leakage. I've recommended using pytest.monkeypatch for robust cleanup. Second, in the Quark configuration logic, a fragile substring check is used for matching layer names, which could result in applying incorrect quantization schemes. I've suggested a more robust pattern matching approach to ensure correctness. Addressing these issues will improve the reliability and correctness of the new mixed-precision quantization feature.

Comment on lines 79 to 95
def test_mixed_precision_model_accuracies(config: EvaluationConfig, task: str):
    os.environ["VLLM_QUARK_EMU_MEM_OPT"] = "1"

    results = lm_eval.simple_evaluate(model="vllm",
                                      model_args=config.get_model_args(),
                                      tasks=task,
                                      batch_size="auto")

    rtol = 0.05

    EXPECTED_VALUE = config.excepted_value
    measured_value = results["results"][task]["acc,none"]
    assert (measured_value - rtol < EXPECTED_VALUE
            and measured_value + rtol > EXPECTED_VALUE
            ), f"Expected: {EXPECTED_VALUE} | Measured: {measured_value}"

    del os.environ["VLLM_QUARK_EMU_MEM_OPT"]
Contributor

high

Setting and deleting an environment variable directly using os.environ can lead to state leakage between tests if an exception occurs before the del statement. This can cause subsequent tests to fail or behave unexpectedly. It's safer to use pytest's monkeypatch fixture, which is already used in this file, to manage environment variables. monkeypatch ensures that the environment is restored to its original state after the test function completes, regardless of whether it passes or fails.

Suggested change
def test_mixed_precision_model_accuracies(config: EvaluationConfig, task: str):
    os.environ["VLLM_QUARK_EMU_MEM_OPT"] = "1"
    results = lm_eval.simple_evaluate(model="vllm",
                                      model_args=config.get_model_args(),
                                      tasks=task,
                                      batch_size="auto")
    rtol = 0.05
    EXPECTED_VALUE = config.excepted_value
    measured_value = results["results"][task]["acc,none"]
    assert (measured_value - rtol < EXPECTED_VALUE
            and measured_value + rtol > EXPECTED_VALUE
            ), f"Expected: {EXPECTED_VALUE} | Measured: {measured_value}"
    del os.environ["VLLM_QUARK_EMU_MEM_OPT"]

def test_mixed_precision_model_accuracies(config: EvaluationConfig, task: str, monkeypatch):
    monkeypatch.setenv("VLLM_QUARK_EMU_MEM_OPT", "1")
    results = lm_eval.simple_evaluate(model="vllm",
                                      model_args=config.get_model_args(),
                                      tasks=task,
                                      batch_size="auto")
    rtol = 0.05
    EXPECTED_VALUE = config.excepted_value
    measured_value = results["results"][task]["acc,none"]
    assert (measured_value - rtol < EXPECTED_VALUE
            and measured_value + rtol > EXPECTED_VALUE
            ), f"Expected: {EXPECTED_VALUE} | Measured: {measured_value}"

Author

@xuebwang-amd xuebwang-amd Sep 5, 2025

Updated, along with some refactoring.

Comment on lines 289 to 294
layer_quant_configs = list()
for name_pattern in layer_quant_config:
    if fnmatch.fnmatch(layer_name, name_pattern):
        return layer_quant_config[name_pattern]
    if layer_name in name_pattern:
        layer_quant_configs.append(
            layer_quant_config[name_pattern])
return layer_quant_configs[0]
Contributor

high

The matching logic if layer_name in name_pattern: is likely incorrect and can lead to bugs. It performs a substring check, which can cause unintended matches (e.g., attention.q_proj would incorrectly match a pattern for cross_attention.q_proj). This is likely not the intended behavior for matching layer configurations and could lead to applying the wrong quantization scheme to a layer.

Given that fnmatch is used elsewhere in this file for pattern matching, it seems the intention is to support glob patterns. If name_pattern can be a comma-separated list of patterns, the logic should be updated to split the string and apply fnmatch to each part. This ensures accurate matching of layer configurations and prevents applying the wrong quantization scheme.

The current implementation also unnecessarily creates a list layer_quant_configs to immediately return its first element. This can be simplified by returning directly upon finding a match.

Suggested change
layer_quant_configs = list()
for name_pattern in layer_quant_config:
    if fnmatch.fnmatch(layer_name, name_pattern):
        return layer_quant_config[name_pattern]
    if layer_name in name_pattern:
        layer_quant_configs.append(
            layer_quant_config[name_pattern])
return layer_quant_configs[0]

for name_pattern in layer_quant_config:
    patterns = [p.strip() for p in name_pattern.split(',')]
    for p in patterns:
        if fnmatch.fnmatch(layer_name, p):
            return layer_quant_config[name_pattern]

Author

@xuebwang-amd xuebwang-amd Sep 5, 2025

The code snippet suggested by gemini-code-assist is problematic: name_pattern is a full layer name, e.g. model.layers.0.block_sparse_moe.experts.0.w1, so name_pattern.split(',') does not make sense and the subsequent fnmatch.fnmatch is likewise irrelevant.
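
For reference, a minimal sketch (the helper name is hypothetical) of the lookup being discussed: a key containing no wildcard characters only matches the identical string under fnmatch, so a single fnmatch pass covers both glob-style keys and fully spelled-out layer names like the one above.

import fnmatch

def find_layer_quant_config(layer_quant_config: dict, layer_name: str):
    # Keys may be glob patterns (e.g. "*v_proj") or exact names
    # (e.g. "model.layers.0.block_sparse_moe.experts.0.w1").
    for name_pattern, cfg in layer_quant_config.items():
        if fnmatch.fnmatch(layer_name, name_pattern):
            return cfg
    return None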

@mergify mergify bot added the documentation label Sep 4, 2025
Contributor

@BowenBao BowenBao left a comment

Thanks, great start!

    dict[str, Any], self.quant_config.get("layer_quant_config"))
layer_quant_configs = list()
for name_pattern in layer_quant_config:
    if fnmatch.fnmatch(layer_name, name_pattern):
Contributor

Is this change necessary? Also, layer_quant_configs seems redundant: it appends the first matched config and immediately returns it.

Author

Updated, as also suggested in #24239 (comment).

) -> tuple[torch.Tensor, None]:
    assert block_shape is None
    if not current_platform.supports_mx():
        VLLM_QUARK_EMU_MEM_OPT = (os.environ.get("VLLM_QUARK_EMU_MEM_OPT",
Contributor

In general, for env flags it is better to add them to vllm/vllm/envs.py with a comment on their effect.

Can you keep this change local? In particular we want to move away from simulation to triton kernels as we move forward. cc @fxmarty-amd

Author

Totally agree on that.
The reason VLLM_QUARK_EMU_MEM_OPT is not added to vllm/vllm/envs.py is that it is better kept as a local, temporary environment variable, just to make things work at the moment. After non-emulation kernels such as Triton or AITER implementations are integrated, we can remove it entirely.

Contributor

@xuebwang-amd this variable that I added previously has been removed per @mgoin's request, in order to avoid adding a new, unnecessary env variable to vLLM, especially given that we have a decently fast MXFP4 dequantization kernel.

Please avoid adding this environment variable, keep it local for testing if needed.

Author

I appreciate your previous effort on this emulation approach; it played a role beyond local testing, and the functionality continues in what I'm doing here.
With VLLM_QUARK_EMU_MEM_OPT=1 enabled, execution indeed goes to the mx.qdq_mxfp4 path defined in https://github.com/vllm-project/vllm/blob/8de261b04a0a0e916d3d25d528d0f2ddeede2a6b/vllm/model_executor/layers/quantization/utils/mxfp4_utils.py#L94C5-L94C25.

The real motivation for this environment variable is to force execution onto the emulation path regardless of the platform's MX support, because the non-emulation kernels have not yet been wired into this flow.

Therefore, the solution here is to remove the if-else statement:

if not current_platform.supports_mx():
    A = quant_dequant_mxfp4(A)
else:
    raise NotImplementedError()

and always use A = quant_dequant_mxfp4(A).
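
A minimal sketch of that simplification (the surrounding signature is abbreviated and the import path is assumed from the file linked above):

from vllm.model_executor.layers.quantization.utils.mxfp4_utils import (
    quant_dequant_mxfp4)

def _mxfp4_quantize(A, block_shape=None):
    assert block_shape is None
    # Always run the emulated MXFP4 quantize-dequantize, instead of gating it
    # on current_platform.supports_mx() and an extra environment variable.
    A = quant_dequant_mxfp4(A)
    return A, None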

layer_quant_set = set(layer_quant_names)

if not kv_cache_set.issubset(layer_quant_set):
if not (kv_cache_set.issubset(layer_quant_set) or \
Contributor

Could you explain the goal of these changes around the kv cache?

For AMP models, are kv-caches still uniformly quantized the same way across all layers?

Author

Yes, currently mixed precision is not applied along the KV-cache dimension; the KV cache is quantized the same way across all layers.
The changes here aim to correctly verify that a kv-cache pattern such as {'*v_proj', '*k_proj'} can be matched, i.e., found among the layer_quant_set entries (the layer names).
This is essential in AMP scenarios where layer_quant_names are specified one by one, rather than via fuzzy/wildcard matching. A sketch of the relaxed check follows.
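
A hedged sketch of that relaxed check (the helper and variable names are illustrative, not the exact code in this diff):

import fnmatch

def kv_cache_patterns_covered(kv_cache_set: set, layer_quant_set: set) -> bool:
    # Each kv-cache pattern (e.g. "*v_proj", "*k_proj") must fnmatch at least
    # one configured layer name, even when layer names are listed one by one.
    return all(
        any(fnmatch.fnmatch(layer_name, pattern)
            for layer_name in layer_quant_set)
        for pattern in kv_cache_set)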

Comment on lines 20 to 25
@pytest.fixture(scope="function", autouse=True)
def use_v0_only(monkeypatch):
    """
    This module relies on V0 internals, so set VLLM_USE_V1=0.
    """
    monkeypatch.setenv('VLLM_USE_V1', '0')
Contributor

Let's avoid using v0

Author

For testing purposes, especially accuracy tests, using V0 is safe. Even for the hardware-metric tests later, V0 is still the safer choice and remains valuable for demonstration.

Contributor

vllm v0 is deprecated: #18571

Author

V1 is reported to be having issues, as you can see. Since mixed-precision quantization does not depend on the V0/V1 engine, it's safe to use V0.

Author

use_v0_only has been removed, as the V0 backend was deprecated very recently in #25351. Thanks @fxmarty-amd

Comment on lines 32 to 37
try:
    huggingface_hub.list_repo_refs(
        "amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8")
    HF_HUB_AMD_ORG_ACCESS = True
except huggingface_hub.errors.RepositoryNotFoundError:
    HF_HUB_AMD_ORG_ACCESS = False
Contributor

Let's use public models

Author

These models are in the process of being published.

Contributor

Do you have an ETA for when we can expect these models to be published?

Author

Colleagues at AMD are speeding up the process; hopefully they can make it happen sometime next week.

Contributor

@xuebwang-amd I meant that for unit testing you can probably use small models just for integration-test purposes (as e.g. in

@pytest.mark.parametrize('model_case', [
    ModelCase("fxmarty/qwen_1.5-moe-a2.7b-mxfp4", tp=1),
    ModelCase("fxmarty/deepseek_r1_3_layers_mxfp4", tp=8),
    ModelCase("fxmarty/Llama-4-Scout-17B-16E-Instruct-2-layers-mxfp4", tp=1)
])

) - but having private models is okay for a while I guess.

Author

@fxmarty-amd your motivation here is to reduce CI time cost, which is good. We can consider picking one public model for the CI test. @gshtras @SageMoore

Author

Do you have an ETA for when we can expect these models to be published?

They have now been published.

    reason="Read access to huggingface.co/amd is required for this test.")
def test_mixed_precision_model_accuracies(model_name: str,
                                          accuracy_numbers: dict, monkeypatch):
    monkeypatch.setenv("VLLM_QUARK_EMU_MEM_OPT", "1")
Contributor

This environment variable has no effect - it has been removed from vllm.

Author

Then we need to remove the if-else statement in _mxfp4_quantize, as commented above in #24239 (comment).

) -> tuple[torch.Tensor, None]:
    assert block_shape is None
    if not current_platform.supports_mx():
        VLLM_QUARK_EMU_MEM_OPT = (os.environ.get("VLLM_QUARK_EMU_MEM_OPT",
Contributor

@xuebwang-amd this variable that I added previously has been removed per @mgoin's request, in order to avoid adding a new, unnecessary env variable to vLLM, especially given that we have a decently fast MXFP4 dequantization kernel.

Please avoid adding this environment variable, keep it local for testing if needed.

Comment on lines +275 to +282
As examples, we provide some ready-to-use quantized mixed-precision models to show the usage in vLLM and the accuracy benefits. They are:

- amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8
- amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8
- amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8
Contributor

Make these public + add link

Author

They're going to be published.

@fxmarty-amd
Contributor

Test Plan

Test on

1. [amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8](https://huggingface.co/amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8)

2. [amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8](https://huggingface.co/amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8)

3. [amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8](https://huggingface.co/amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8)

Can you provide:

  • Which proportion of layers are in FP8/MXFP4
  • Comparison against MXFP4 alone?

xuebwang-amd and others added 8 commits September 15, 2025 07:55
Co-authored-by: fxmarty-amd <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
Co-authored-by: fxmarty-amd <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
Co-authored-by: fxmarty-amd <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
Co-authored-by: fxmarty-amd <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
Co-authored-by: fxmarty-amd <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
Co-authored-by: fxmarty-amd <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
Co-authored-by: fxmarty-amd <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
@xuebwang-amd
Author

xuebwang-amd commented Sep 15, 2025

Test Plan

Test on

1. [amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8](https://huggingface.co/amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8)

2. [amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8](https://huggingface.co/amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8)

3. [amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8](https://huggingface.co/amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8)

Can you provide:

  • Which proportion of layers are in FP8/MXFP4
  • Comparison against MXFP4 alone?

One can check the detailed layerwise MXFP4/FP8 configuration in the config.json, specifically under the key quantization_config:layer_quant_config.
Accuracies and hardware metrics are being measured not only for plain MXFP4 but also for plain FP8; updates are ongoing.
Note that these numbers are not guaranteed to be final or fully optimized; they are for demonstration purposes and could be further improved.
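
For example, the per-scheme layer split asked about above can be tallied directly from the published config.json. A hedged sketch: the grouping key is just the serialized per-layer entry, so nothing is assumed about its internal field names.

import json
from collections import Counter

from huggingface_hub import hf_hub_download

config_path = hf_hub_download("amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8",
                              "config.json")
with open(config_path) as f:
    layer_quant_config = json.load(f)["quantization_config"]["layer_quant_config"]

# Count how many layers share each per-layer quantization scheme.
scheme_counts = Counter(
    json.dumps(cfg, sort_keys=True) for cfg in layer_quant_config.values())
total = len(layer_quant_config)
for scheme, count in scheme_counts.most_common():
    print(f"{count}/{total} layers: {scheme[:80]}")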

@mergify mergify bot added the ci/build, deepseek, llama, multi-modality, new-model, performance, qwen, and structured-output labels Oct 10, 2025
@mergify mergify bot added the kv-connector label Oct 10, 2025

mergify bot commented Oct 10, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @xuebwang-amd.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@xuebwang-amd xuebwang-amd force-pushed the xuebin/upstream_amd_quark_layerwise_mixed_precision branch from 3f92445 to 575caf2 on October 10, 2025 10:52
@mergify mergify bot removed the tpu label Oct 10, 2025
@xuebwang-amd
Author

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @xuebwang-amd.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

I have reverted it back to a good commit, 575caf2, so there are no conflicts anymore.


Labels

ci/build, deepseek, documentation, frontend, gpt-oss, kv-connector, llama, multi-modality, new-model, performance, qwen, ready, rocm, speculative-decoding, structured-output, tool-calling, v1
