keti9 #5090
base: develop
Conversation
root seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. Have you signed the CLA already but the status is still pending? Let us recheck it.

Thanks for your contribution!
Pull Request Overview
This PR appears to be a work-in-progress or development/testing branch (titled "keti9") that introduces conditional logic for selecting between two MOE expert dispatch implementations. The PR lacks proper documentation and contains debugging artifacts.
Key Changes:
- Adds a new `f_ep_moe_expert_dispatch` C++ custom operator implementation
- Introduces global state management via a `setting.py` module to control CUDA graph usage
- Adds conditional dispatch logic in the XPU MOE layer based on a `use_cuda_graph` flag
- Includes test files and debug logging
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| custom_ops/xpu_ops/src/ops/f_moe_ep_dispatch.cc | New C++ operator implementing MOE expert dispatch without CUDA graph support |
| fastdeploy/model_executor/layers/backends/xpu/moe/fused_moe.py | Adds conditional logic to choose between dispatch implementations with debug prints |
| fastdeploy/config.py | Imports and sets global setting.use_cuda_graph state with debug prints |
| fastdeploy/input/ernie4_5_processor.py | Adds debug logging for request processing |
| setting.py | Global module storing use_cuda_graph configuration |
| fastdeploy/cuda_graph_config.py | Duplicate configuration file for CUDA graph setting |
| test_1_query.py | Test script for API server queries |
| scripts/build_and_run.sh | Build and run script for XPU testing |
```python
    free_tensor,
    set_weight_attrs,
)
import setting
```
Copilot AI · Nov 17, 2025
Importing a top-level setting module creates a problematic global state dependency and breaks modularity. This makes the code difficult to test and creates hidden dependencies. Consider passing the use_cudagraph configuration through the existing config system or as a parameter to the class/method instead of relying on a global import.
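A minimal sketch of the suggested refactor, assuming illustrative names (`MoeDispatchConfig`, `XPUMoeLayer` are not the actual FastDeploy classes): the flag is injected through a config object instead of being read from a global `setting` module.

```python
# Hedged sketch: pass use_cudagraph through a config object rather than
# importing a module-level global. Names below are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class MoeDispatchConfig:
    use_cudagraph: bool = False  # mirrors the existing use_cudagraph flag


class XPUMoeLayer:
    def __init__(self, config: MoeDispatchConfig):
        # The flag travels with the object; no hidden module-level state.
        self.use_cudagraph = config.use_cudagraph

    def dispatch_op_name(self) -> str:
        # Explicit selection between the two dispatch implementations.
        if self.use_cudagraph:
            return "ep_moe_expert_dispatch"
        return "f_ep_moe_expert_dispatch"
```

With this shape, tests can construct a `MoeDispatchConfig` directly instead of monkey-patching a global module.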
```python
print(f"key is {key}")
print(f"value is {value}")
```
Copilot AI · Nov 17, 2025
Debug print statements should be removed before merging. Use the logging framework (e.g., logger.debug()) instead of print statements for production code.
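A short sketch of the suggested fix, assuming a hypothetical logger name and wrapper function: the key/value output goes through the logging framework at DEBUG level, so it is silent unless debugging is enabled.

```python
# Hedged sketch: replace the print() calls with DEBUG-level logging.
# The logger name and function name are assumptions for illustration.
import logging

logger = logging.getLogger("fastdeploy.model_executor")


def log_weight_attr(key, value):
    # Lazy %-formatting: the message string is only built if DEBUG is enabled.
    logger.debug("key is %s", key)
    logger.debug("value is %s", value)
```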
```python
import setting
print(f"setting.use_cuda_graph : {setting.use_cuda_graph}")
setting.use_cuda_graph = self.use_cudagraph
print(f"setting.use_cuda_graph : {setting.use_cuda_graph}")
```
Copilot AI · Nov 17, 2025
This creates a tight coupling to a global setting module and contains debug print statements. The configuration should be managed through the class's existing configuration system rather than mutating global state. Remove debug prints and refactor to avoid global state mutation.
```python
    self.moe_quant_type,
)
if setting.use_cuda_graph:
    print(f"调用ep_moe_expert_dispatch")
```
Copilot AI · Nov 17, 2025
Debug print statement in Chinese should be removed before merging. If this information is needed for debugging, use the logging framework with an appropriate log level (e.g., logger.debug()).
```python
    self.moe_quant_type,
)
else:
    print(f"调用f_ep_moe_expert_dispatch")
```
Copilot AI · Nov 17, 2025
Debug print statement in Chinese should be removed before merging. If this information is needed for debugging, use the logging framework with an appropriate log level (e.g., logger.debug()).
```diff
@@ -0,0 +1,76 @@
+import json
```
Copilot AI · Nov 17, 2025
This test file should not be in the repository root. Test files should be organized in the tests/ directory with proper naming conventions. If this is a temporary development file, it should be removed before merging.
```shell
# fi

# export XPUAPI_DEBUG=0x1
export PYTHONPATH=/opt/output/work_dir/ssd2/yangshuang/work/yinwei/FD_FOR_KETI9
```
Copilot AI · Nov 17, 2025
Hardcoded absolute path specific to a developer's environment should not be committed. This path is machine-specific and will break on other systems. Use relative paths or environment variables that users can configure.
```cpp
const int topk = topk_ids_shape[topk_ids_shape.size() - 1];
const int expert_num = token_nums_per_expert.size();
return {{token_nums_this_rank, hidden_size},
        {expert_num, m},
```
Copilot AI · Nov 17, 2025
The output shape {expert_num, m} appears incorrect. Comparing to the original moe_ep_dispatch.cc, line 174 shows the correct shape should be {m, topk} for permute_indices_per_token. The dimensions are swapped, which will cause shape mismatch errors.
```diff
-        {expert_num, m},
+        {m, topk},
```
```python
        request["enable_thinking"] = True

    data_processor_logger.info(f"Processed request dict: {request}")
    data_processor_logger.info(f"Processed request dict: {request}, len(request.get('prompt_token_ids')): {len(request.get('prompt_token_ids'))}")
```
Copilot AI · Nov 17, 2025
This long log line should be split into multiple lines or simplified. Logging the entire request dict may expose sensitive information and create excessive log output. Consider logging only essential fields or using debug level for verbose output.
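A hedged sketch of the suggested split, assuming the logger name and field names visible in the snippet above: the INFO line stays short, and the full request dump moves to DEBUG level.

```python
# Hedged sketch: log only essential fields at INFO; the verbose request
# dump moves to DEBUG. Logger and field names are assumptions.
import logging

data_processor_logger = logging.getLogger("fastdeploy.input")


def log_processed_request(request: dict) -> None:
    token_ids = request.get("prompt_token_ids") or []
    # Short INFO line: just the token count, not the whole dict.
    data_processor_logger.info("Processed request, %d prompt tokens", len(token_ids))
    # Full dump only when DEBUG is enabled.
    data_processor_logger.debug("Processed request dict: %s", request)
```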
```diff
@@ -0,0 +1,76 @@
+import json
```
Copilot AI · Nov 17, 2025
Import of 'json' is not used.
```diff
-import json
```
Motivation
Modifications
Usage or Command
Accuracy Tests
Checklist
- Add at least one tag in the PR title from: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- Run pre-commit before commit.
- For PRs targeting the release branch, make sure the PR has been submitted to the develop branch first, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.