Skip to content

Conversation

@iosmers
Copy link
Collaborator

@iosmers iosmers commented Nov 17, 2025

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

Copilot AI review requested due to automatic review settings November 17, 2025 10:47
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


root seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@paddle-bot
Copy link

paddle-bot bot commented Nov 17, 2025

Thanks for your contribution!

Copilot finished reviewing on behalf of iosmers November 17, 2025 10:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR appears to be a work-in-progress or development/testing branch (titled "keti9") that introduces conditional logic for selecting between two MOE expert dispatch implementations. The PR lacks proper documentation and contains debugging artifacts.

Key Changes:

  • Adds a new f_ep_moe_expert_dispatch C++ custom operator implementation
  • Introduces global state management via setting.py module to control CUDA graph usage
  • Adds conditional dispatch logic in the XPU MOE layer based on use_cuda_graph flag
  • Includes test files and debug logging

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
custom_ops/xpu_ops/src/ops/f_moe_ep_dispatch.cc New C++ operator implementing MOE expert dispatch without CUDA graph support
fastdeploy/model_executor/layers/backends/xpu/moe/fused_moe.py Adds conditional logic to choose between dispatch implementations with debug prints
fastdeploy/config.py Imports and sets global setting.use_cuda_graph state with debug prints
fastdeploy/input/ernie4_5_processor.py Adds debug logging for request processing
setting.py Global module storing use_cuda_graph configuration
fastdeploy/cuda_graph_config.py Duplicate configuration file for CUDA graph setting
test_1_query.py Test script for API server queries
scripts/build_and_run.sh Build and run script for XPU testing

free_tensor,
set_weight_attrs,
)
import setting
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Importing a top-level setting module creates a problematic global state dependency and breaks modularity. This makes the code difficult to test and creates hidden dependencies. Consider passing the use_cudagraph configuration through the existing config system or as a parameter to the class/method instead of relying on a global import.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +853 to +854
print(f"key is {key}")
print(f"value is {value}")
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug print statements should be removed before merging. Use the logging framework (e.g., logger.debug()) instead of print statements for production code.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +859 to +862
import setting
print(f"setting.use_cuda_graph : {setting.use_cuda_graph}")
setting.use_cuda_graph = self.use_cudagraph
print(f"setting.use_cuda_graph : {setting.use_cuda_graph}")
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates a tight coupling to a global setting module and contains debug print statements. The configuration should be managed through the class's existing configuration system rather than mutating global state. Remove debug prints and refactor to avoid global state mutation.

Copilot generated this review using guidance from repository custom instructions.
self.moe_quant_type,
)
if setting.use_cuda_graph:
print(f"调用ep_moe_expert_dispatch")
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug print statement in Chinese should be removed before merging. If this information is needed for debugging, use the logging framework with an appropriate log level (e.g., logger.debug()).

Copilot generated this review using guidance from repository custom instructions.
self.moe_quant_type,
)
else:
print(f"调用f_ep_moe_expert_dispatch")
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug print statement in Chinese should be removed before merging. If this information is needed for debugging, use the logging framework with an appropriate log level (e.g., logger.debug()).

Copilot generated this review using guidance from repository custom instructions.
@@ -0,0 +1,76 @@
import json
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test file should not be in the repository root. Test files should be organized in the tests/ directory with proper naming conventions. If this is a temporary development file, it should be removed before merging.

Copilot generated this review using guidance from repository custom instructions.
# fi

# export XPUAPI_DEBUG=0x1
export PYTHONPATH=/opt/output/work_dir/ssd2/yangshuang/work/yinwei/FD_FOR_KETI9
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded absolute path specific to a developer's environment should not be committed. This path is machine-specific and will break on other systems. Use relative paths or environment variables that users can configure.

Copilot generated this review using guidance from repository custom instructions.
const int topk = topk_ids_shape[topk_ids_shape.size() - 1];
const int expert_num = token_nums_per_expert.size();
return {{token_nums_this_rank, hidden_size},
{expert_num, m},
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output shape {expert_num, m} appears incorrect. Comparing to the original moe_ep_dispatch.cc, line 174 shows the correct shape should be {m, topk} for permute_indices_per_token. The dimensions are swapped, which will cause shape mismatch errors.

Suggested change
{expert_num, m},
{m, topk},

Copilot uses AI. Check for mistakes.
request["enable_thinking"] = True

data_processor_logger.info(f"Processed request dict: {request}")
data_processor_logger.info(f"Processed request dict: {request}, len(request.get('prompt_token_ids')): {len(request.get('prompt_token_ids'))}")
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This long log line should be split into multiple lines or simplified. Logging the entire request dict may expose sensitive information and create excessive log output. Consider logging only essential fields or using debug level for verbose output.

Copilot generated this review using guidance from repository custom instructions.
@@ -0,0 +1,76 @@
import json
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'json' is not used.

Suggested change
import json

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants