
Conversation

@orionpapadakis (Collaborator)

No description provided.

@mikepapadim changed the title from "[WIP] Add support for Qwen2.5 and Deepseek-Distilled-Qwen models" to "[WIP][models][deepseek][qwen2.5] Add support for Qwen2.5 and Deepseek-Distilled-Qwen models" on Aug 1, 2025
@mikepapadim requested a review from Copilot August 4, 2025 10:50

@mikepapadim requested a review from Copilot August 6, 2025 13:02

@mikepapadim changed the title from "[WIP][models][deepseek][qwen2.5] Add support for Qwen2.5 and Deepseek-Distilled-Qwen models" to "[models][deepseek][qwen2.5] Add support for Qwen2.5 and Deepseek-Distilled-Qwen models" on Aug 29, 2025
@mikepapadim marked this pull request as ready for review August 29, 2025 11:12
@mikepapadim (Member)

Fixes #19

@mikepapadim requested a review from Copilot August 29, 2025 11:13
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR adds support for Qwen2.5 and Deepseek-Distilled-Qwen models to the LLaMA inference framework. It introduces new model types, loaders, and computation kernels to handle these model architectures with their specific requirements.

Key changes:

  • Added new model types QWEN_2 and DEEPSEEK_R1_DISTILL_QWEN with corresponding configurations and state management
  • Implemented specialized TornadoVM computation kernels for Qwen2 models, including bias addition operations
  • Added automatic reasoning-token injection for DeepSeek-R1-Distill-Qwen models (a sketch follows this list)
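
A minimal sketch of the injection named in the last bullet, assuming a tokenizer that exposes special tokens by name; the helper method, the Tokenizer/ModelType shapes, and the "<think>" token name are illustrative, not the PR's exact API:

import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: DeepSeek-R1 distills are trained to open their answers with a
// reasoning block, so the encoded prompt gets the reasoning-start token appended.
static List<Integer> encodeWithReasoning(Tokenizer tokenizer, ModelType type, String prompt) {
    List<Integer> tokens = new ArrayList<>(tokenizer.encode(prompt));
    if (type == ModelType.DEEPSEEK_R1_DISTILL_QWEN) {
        tokens.add(tokenizer.getSpecialTokens().get("<think>"));
    }
    return tokens;
}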

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Summary per file:

TransformerComputeKernelsLayered.java: Added addInPlace kernel for element-wise array addition
TornadoVMMasterPlan.java: Added Qwen2 planner support and refactored model type dispatching
Qwen2TornadoVMLayerPlanner.java: New TornadoVM layer planner for Qwen2 models with bias operations
Qwen3Tokenizer.java: Updated token display logic to include reasoning tokens
Qwen3.java: Added shouldAddBeginOfText override
Qwen2Configuration.java: New configuration record for Qwen2 models
Qwen2.java: Main Qwen2 model implementation with DeepSeek-R1-specific behavior
Phi3.java: Added shouldAddBeginOfText override
Qwen2ModelLoader.java: Model loader for Qwen2/DeepSeek models with bias weight handling
ModelLoader.java: Updated model type detection logic
ModelType.java: Added new model types and DeepSeek detection (see the sketch after this table)
Model.java: Added reasoning token injection for DeepSeek models
Qwen2TornadoWeights.java: TornadoVM weights implementation for Qwen2
Qwen2StandardWeights.java: Standard weights implementation for Qwen2
Qwen2State.java: State management for Qwen2 models
InferenceCore.java: Java inference implementation for Qwen2
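
A hedged sketch of the detection flow the ModelLoader.java and ModelType.java rows describe: "general.architecture" is a standard GGUF metadata key and "general.basename" appears in the loader snippet further down, but the method shape and the fallback branch are assumptions:

import java.util.Map;

// Illustrative only: dispatch on GGUF metadata. DeepSeek's distilled checkpoints
// reuse the Qwen2 architecture, so the basename is what tells them apart.
static ModelType detectModelType(Map<String, String> metadata) {
    String architecture = metadata.get("general.architecture");
    if ("qwen2".equals(architecture)) {
        return "DeepSeek-R1-Distill-Qwen".equals(metadata.get("general.basename"))
                ? ModelType.DEEPSEEK_R1_DISTILL_QWEN
                : ModelType.QWEN_2;
    }
    throw new IllegalArgumentException("Unsupported architecture: " + architecture);
}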

.task("qbias", TransformerComputeKernelsLayered::addInPlace, state.wrapQ, weights.q_biasLayered[layerIndex], config.dim())
.task("kbias", TransformerComputeKernelsLayered::addInPlace, state.wrapK, weights.k_biasLayered[layerIndex], config.kvDim())
.task("vbias", TransformerComputeKernelsLayered::addInPlace, state.wrapV, weights.v_biasLayered[layerIndex], config.kvDim())
.task("rope", Qwen3Kernels::ropeRotation,context, state.positionHolder, state.wrapQ, state.wrapK, config.numberOfKeyValueHeads(),

Copilot AI Aug 29, 2025

Using Qwen3Kernels::ropeRotation for Qwen2 models may be incorrect. Verify that Qwen2 and Qwen3 use identical RoPE implementations, or create a Qwen2-specific RoPE kernel.

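For context on the addInPlace tasks above, a minimal sketch of an element-wise in-place addition kernel in TornadoVM, assuming FloatArray buffers and a @Parallel loop; the PR's actual kernel may differ in signature:

import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.types.arrays.FloatArray;

public final class BiasKernelSketch {
    // out[i] += bias[i] for i in [0, size); @Parallel lets TornadoVM map the
    // loop iterations onto the accelerator's thread grid.
    public static void addInPlace(FloatArray out, FloatArray bias, int size) {
        for (@Parallel int i = 0; i < size; i++) {
            out.set(i, out.get(i) + bias.get(i));
        }
    }
}
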
/**
- * Executes the forward pass of a LLaMA transformer model using TornadoVM acceleration. This method processes the transformer layers in sequence for a particular token position in the context
+ * Executes the forward pass of a LLaMA transformer model using TornadoVM acceleration.
+ *This method processes the transformer layers in sequence for a particular token position in the context

Copilot AI Aug 29, 2025

Missing space after the asterisk in the comment. Should be '* This method' instead of '*This method'.

@Override
public int contextLengthModel() {
    return contextLengthModel;
}

Copilot AI Aug 29, 2025

The method returns the field contextLengthModel but should return the parameter contextLengthModel() to match the interface contract. This creates infinite recursion.
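
For reference, the recursion risk described here: returning the backing field is the safe form, while calling the accessor from its own body recurses until a StackOverflowError.

// Correct: the accessor returns the backing field.
public int contextLengthModel() {
    return contextLengthModel;
}

// Broken (illustration only): the accessor calls itself and never terminates.
public int contextLengthModel() {
    return contextLengthModel();
}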

try (var ignored = Timer.log("Load " + modelName + " model")) {
    // reuse Qwen3's vocabulary-loading method
    Vocabulary vocabulary = loadQwen3Vocabulary(metadata);
    boolean isDeepSeekR1DistillQwen = "DeepSeek-R1-Distill-Qwen".equals(metadata.get("general.basename"));

Copilot AI Aug 29, 2025

The string literal 'DeepSeek-R1-Distill-Qwen' is duplicated on lines 42 and 49. Consider extracting it to a constant to avoid duplication.
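
The suggested refactor, sketched with an illustrative constant name:

// Hoist the duplicated literal into a single named constant...
private static final String DEEPSEEK_R1_DISTILL_QWEN = "DeepSeek-R1-Distill-Qwen";

// ...and compare against it at each call site:
boolean isDeepSeekR1DistillQwen = DEEPSEEK_R1_DISTILL_QWEN.equals(metadata.get("general.basename"));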

@mikepapadim merged commit 9a4a81d into beehive-lab:main Sep 1, 2025
1 check passed