-
Notifications
You must be signed in to change notification settings - Fork 12k
feat(rpc): compile-time op metadata & RPC graph validation #13167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We have to find a better solution - this is difficult to maintain because nothing guarantees that changes to the |
Good idea 👍 My initial thoughts revolved around utilising code generation, but how about this:
typedef struct {
int n_src; // Number of required source operands
// Potentially add other useful metadata later if needed
} ggml_op_metadata_t;
static const ggml_op_metadata_t GGML_OP_METADATA[GGML_OP_COUNT] = {
[GGML_OP_NONE] = {.n_src = 0},
[GGML_OP_DUP] = {.n_src = 1},
[GGML_OP_ADD] = {.n_src = 2},
[GGML_OP_ADD1] = {.n_src = 2},
...
static_assert(sizeof(GGML_OP_METADATA) / sizeof(GGML_OP_METADATA[0]) == GGML_OP_COUNT,
"GGML_OP_METADATA array size mismatch with GGML_OP_COUNT");
static inline int ggml_op_get_n_src(enum ggml_op op) {
if (op >= 0 && op < GGML_OP_COUNT) {
return GGML_OP_METADATA[op].n_src;
}
// Handle invalid op case if necessary, e.g., return -1
return -1; // Or assert/abort
} I'll park this as a draft in the meanwhile. |
Refactor `ggml_op` and `GGML_OP_METADATA` using X-Macros. This ensures compile-time synchronization between the enum and metadata. `ggml_op_metadata_check()` verifies this at compile time during `ggml_init`. This enables robust graph validation in the RPC server. Previously, malformed graphs (e.g., ADD with NULL src[1]) could cause crashes. `validate_graph_operands` now uses the X-Macro-generated metadata (`ggml_op_get_n_src`) to check for required non-null source operands before execution. Invalid graphs are rejected early. Adds `test_op_metadata_counts` to verify the metadata system. Signed-off-by: Ville Vesilehto <[email protected]>
86182fb
to
1b7d972
Compare
Added compile-time checks through X-macros. Looks like the build is failing on some platforms. I think the struct initialiser can be modified to fix this though. Early feedback on the approach would be much appreciated! |
This seems like a lot of complexity. The simpler approach from security perspective is to not have a unauthorized clients to submit graphs freely to the server. Then this validation would not be needed. |
I think it's not only about malicious input but incomplete graphs due to programmatic errors, which then end up crashing the RPC server in the process. |
Programmatically we should not allow to create graphs that are incomplete. If instead this was a correct assumption, it would require all backends to validate the graphs which is not practical.
|
Sure, I think that's a good improvement for anyone utilising the C++ interface. But you could still send incomplete input with valid wire format over the RPC interface, intentionally or not, without server-side validation present. Sounds like that's an accepted risk so I'll close this PR. |
Motivation:
When the RPC server receives a compute graph from a client and deserializes it, there's a possibility that the graph structure is malformed. Specifically, nodes might be missing required input operands (
src[0]
,src[1]
, etc.) for their specified operation (ggml_op
). Passing such an incomplete graph directly toggml_backend_graph_compute
could lead to crashes or undefined behavior within the backend implementation.Changes:
The
ggml_op
enum and theGGML_OP_METADATA
array (which stores metadata liken_src
- the number of source operands) are now automatically generated fromGGML_OP_LIST
by using X-Macros. This ensures they are always synchronized.A compile-time check function
ggml_op_metadata_check()
is added and called duringggml_init()
. This function uses a switch statement over allggml_op
enum values. If an operation is added to the enum but not toGGML_OP_LIST
(and thus not to the metadata), compilers configured to treat unhandled enum cases in a switch as an error (e.g., via -Werror=switch-enum) will flag this at compile time.For runtime checks this PR introduces a new static helper function
validate_graph_operands
withinggml-rpc.cpp
. This function iterates through the nodes of the deserialized graph before computation begins.ggml_op
of each node, it verifies that all requiredsrc[i]
operand pointers are non-null.validate_graph_operands
function, which is called fromrpc_server::graph_compute
immediately after graph deserialization and before callingggml_backend_graph_compute
.If validation fails, the computation request is rejected early, preventing the invalid graph from reaching the compute backend. The server will output the following:
Performance:
This adds a quick
O(N)
validation step before the main computation. This preprocessing overhead is expected to be negligible compared to the actual graph computation time but significantly improves the server's robustness against malformed client requests.The validation could also be done within compute kernels. I rejected the idea because:
if (src == nullptr)
) directly within performance-critical compute kernels could introduce branching overhead and potentially hinder optimizations. Potentially impacting overall computation performance.