
Conversation

@am17an (Collaborator) commented on Oct 19, 2025:

This PR adds ggml_can_fuse_subgraph, a less strict extension of ggml_can_fuse. Given the inputs/outputs of a subgraph, it checks whether all of the intermediate tensors can be fused. Marking as a draft to iterate on the correct API.
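
For reference, a rough sketch of the shape such a helper might take, reconstructed from the hunks reviewed below (the exact signature was still being iterated on in this draft, so the parameter names here are illustrative):

```c
// Illustrative sketch only -- the draft PR was still iterating on the API.
// Checks whether the nodes at node_idxs[0..count) form a fusable subgraph:
// every intermediate tensor must be consumed only inside the subgraph, and
// only the listed outputs may be visible to the rest of the graph.
bool ggml_can_fuse_subgraph(const struct ggml_cgraph * cgraph,
                            const int *                node_idxs,   // indices into cgraph->nodes
                            int                        count,
                            const int *                outputs,     // node indices of subgraph outputs
                            int                        num_outputs);
```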

@am17an marked this pull request as a draft on October 19, 2025 06:43
@github-actions bot added the labels Nvidia GPU (issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) on Oct 19, 2025
@am17an changed the title from "Ggml can fuse subgraph" to "ggml: add ggml_can_fuse_subgraph" on Oct 19, 2025
@am17an requested a review from @jeffbolznv on October 19, 2025 06:45
@jeffbolznv (Collaborator) commented:

While this does check that the internal nodes aren't used outside of the fusion region, we also need to check that the internal connectivity of the graph is what we expect. IMO this necessarily has to be verbose, because we have to check that all the node->src[] values are what we expect them to be.

The first idea that comes to mind would be to pass in a list of triples, where each triple is a { dst_node, src_idx, src_node }, and verify that nodes[start + dst_node]->src[src_idx] == nodes[start + src_node].
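
A minimal sketch of that idea (the struct and helper names here are hypothetical, and `start` is assumed to be the subgraph's base index in `cgraph->nodes`):

```c
// Hypothetical edge descriptor: nodes[start + dst_node]->src[src_idx]
// is expected to point at nodes[start + src_node].
struct fusion_edge {
    int dst_node;
    int src_idx;
    int src_node;
};

static bool fusion_edges_match(const struct ggml_cgraph * cgraph, int start,
                               const struct fusion_edge * edges, int num_edges) {
    for (int i = 0; i < num_edges; ++i) {
        const struct fusion_edge * e = &edges[i];
        if (cgraph->nodes[start + e->dst_node]->src[e->src_idx] !=
            cgraph->nodes[start + e->src_node]) {
            return false;
        }
    }
    return true;
}
```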

@am17an (Collaborator, Author) commented on Oct 20, 2025:

> While this does check that the internal nodes aren't used outside of the fusion region, we also need to check that the internal connectivity of the graph is what we expect. IMO this necessarily has to be verbose, because we have to check that all the node->src[] values are what we expect them to be.

I think this is better done at the caller site, which has more context about the fusion. This function is just meant to avoid cases where a write is needed elsewhere but we end up fusing it anyway, and to provide a common place for other sanity checks. Maybe a better name for this function would be is_fusion_candidate. Passing triples like you mentioned is equivalent to building the expected graph at the caller site, just without doing the equality checks.

@jeffbolznv (Collaborator) commented:

> I think this is better done at the caller site

It's probably fine to have it as a separate function, but IMO it can still be common code (as much as any of this can be common code - there will always be special cases we want to handle differently).

Commit: 2. add check for views: view_src should be part of the subgraph
@am17an force-pushed the ggml_can_fuse_subgraph branch from 3059ed3 to d853036 on October 20, 2025 14:52

```c
// if node is a view, check if the view src is within the subgraph
if (node->view_src) {
    const struct ggml_tensor * view_src = node->view_src;
```
@am17an (Collaborator, Author):

Maybe we need to walk the tree until we reach the non-view parent, instead of what I did here.
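
Something along these lines (a sketch; note that ggml's view ops typically normalize view_src to the root tensor already, in which case a single hop suffices):

```c
// Walk the view chain to the ultimate non-view parent before checking
// whether it belongs to the subgraph (sketch).
const struct ggml_tensor * view_src = node->view_src;
while (view_src->view_src) {
    view_src = view_src->view_src;
}
// ... then check that view_src is one of the subgraph's nodes or inputs
```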

Collaborator:

I guess the most conservative thing would be to check all parents. It seems plausible we'll want to fuse a view of a view in the future.

```c
        const int * idxs,
        int count,
        const struct ggml_tensor * tensor) {
    if (idxs == NULL || cgraph == NULL) {
```
Collaborator:

IMO this could be removed or just be an assertion.
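
For example, turning it into a precondition (GGML_ASSERT is the library's standard assertion macro):

```c
// Treat NULL arguments as a programming error rather than a runtime case.
GGML_ASSERT(idxs != NULL && cgraph != NULL);
```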


const struct ggml_tensor * node = cgraph->nodes[node_idxs[i]];

if (node->flags & GGML_TENSOR_FLAG_OUTPUT) {
Collaborator:

Suggest moving this check after the ggml_find_tensor_node_list(cgraph, outputs, num_outputs, node) != -1 check. It's OK for the output nodes to have the output flag, since they won't be elided.
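
A sketch of the suggested ordering, using the names from the surrounding hunks (whether the flag check should `return false` or take some other action is an assumption here):

```c
// Declared subgraph outputs may legitimately carry the OUTPUT flag;
// they are not elided by fusion, so skip them before checking the flag.
if (ggml_find_tensor_node_list(cgraph, outputs, num_outputs, node) != -1) {
    continue;
}
if (node->flags & GGML_TENSOR_FLAG_OUTPUT) {
    return false; // an interior node visible as a graph output cannot be elided
}
```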

```c
        continue;
    }

    interior_nodes[interior_nodes_count++] = node_idxs[i];
```
Collaborator:

I don't think it's necessary anymore to have the two loops and this interior_nodes array.


// if interior-node has n-uses, ensure that all of them lie within in this subgraph
int subgraph_uses = 0;
for (int j = 0; j < count; ++j) {
Collaborator:

If you combine the two loops, this loop could start from j = i+1.
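
That is, assuming `cgraph->nodes` is in topological order (a consumer can never precede its producer), the use-counting loop only needs to scan forward:

```c
// Sketch: count uses of node within the subgraph, scanning only the
// nodes after i -- earlier nodes cannot consume a later node's result.
int subgraph_uses = 0;
for (int j = i + 1; j < count; ++j) {
    const struct ggml_tensor * other_node = cgraph->nodes[node_idxs[j]];
    for (int src_idx = 0; src_idx < GGML_MAX_SRC; src_idx++) {
        if (other_node->src[src_idx] == node) {
            subgraph_uses++;
        }
    }
}
```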

```c
    for (int j = 0; j < count; ++j) {
        const struct ggml_tensor * other_node = cgraph->nodes[node_idxs[j]];
        for (int src_idx = 0; src_idx < GGML_MAX_SRC; src_idx++) {
            if (other_node->src[src_idx] && other_node->src[src_idx] == node) {
```
Collaborator:

Suggested change:

```diff
- if (other_node->src[src_idx] && other_node->src[src_idx] == node) {
+ if (other_node->src[src_idx] == node) {
```

