-
Notifications
You must be signed in to change notification settings - Fork 13.4k
ggml: add ggml_can_fuse_subgraph #16662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
While this does check that the internal nodes aren't used outside of the fusion region, we also need to check that the internal connectivity of the graph is what we expect. IMO this necessarily has to be verbose, because we have to check that all the node->src[] values are what we expect them to be. The first idea that comes to mind would be to pass in a list of triples, where each triple is a |
I think this is better done at the caller site, which has more context about the fusion. This is just to avoid cases where the write is needed elsewhere we end up fusing it and have a common function for other sanity checks. Maybe a better a name for this function should be |
It's probably fine to have it as a separate function, but IMO it can still be common code (as much as any of this can be common code - there will always be special cases we want to handle differently). |
2. add check for views: view_src should be part of the subgraph
3059ed3
to
d853036
Compare
|
||
// if node is a view, check if the view src is within the subgraph | ||
if (node->view_src) { | ||
const struct ggml_tensor * view_src = node->view_src; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe need to walk the tree till we get the non-view parent instead of what I did here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the most conservative thing would be to check all parents. It seems plausible we'll want to fuse a view of a view in the future.
const int * idxs, | ||
int count, | ||
const struct ggml_tensor * tensor) { | ||
if (idxs == NULL || cgraph == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this could be removed or just be an assertion.
|
||
const struct ggml_tensor * node = cgraph->nodes[node_idxs[i]]; | ||
|
||
if (node->flags & GGML_TENSOR_FLAG_OUTPUT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest moving this check after the ggml_find_tensor_node_list(cgraph, outputs, num_outputs, node) != -1
check. It's OK for the output nodes to have the output flag, since they won't be elided.
continue; | ||
} | ||
|
||
interior_nodes[interior_nodes_count++] = node_idxs[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary anymore to have the two loops and this interior_nodes array.
|
||
// if interior-node has n-uses, ensure that all of them lie within in this subgraph | ||
int subgraph_uses = 0; | ||
for (int j = 0; j < count; ++j) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you combine the two loops, this loop could start from j = i+1
.
for (int j = 0; j < count; ++j) { | ||
const struct ggml_tensor * other_node = cgraph->nodes[node_idxs[j]]; | ||
for (int src_idx = 0; src_idx < GGML_MAX_SRC; src_idx++) { | ||
if (other_node->src[src_idx] && other_node->src[src_idx] == node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (other_node->src[src_idx] && other_node->src[src_idx] == node) { | |
if (other_node->src[src_idx] == node) { |
|
||
// if node is a view, check if the view src is within the subgraph | ||
if (node->view_src) { | ||
const struct ggml_tensor * view_src = node->view_src; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the most conservative thing would be to check all parents. It seems plausible we'll want to fuse a view of a view in the future.
This PR adds
ggml_can_fuse_subgraph
which is a less strict extension ofggml_can_fuse
. It checks given inputs/outputs of a subgraph, whether all the intermediate tensors can be fused. Putting as draft to iterate on the correct API