Skip to content

Conversation

chraac
Copy link

@chraac chraac commented May 19, 2024

In PR #6829, @rgerganov add support to rpc backend, after using it for several days, I have noticed an issue:

  • When the client closes the connection, the server does not free the memory it has allocated.

Upon investigating the source code, I discovered that instead of releasing the memory, we simply exit the inner loop and immediately wait for a new connection (ggml-rpc.cpp#L1027).

So here I create this PR, which monitor the ALLOC_BUFFER and FREE_BUFFER command, maintaining a list of allocated buffers, then free the remind buffer after client disconnect.

@@ -934,7 +938,7 @@ static void rpc_serve_client(ggml_backend_t backend, sockfd_t sockfd, size_t fre
}
switch (cmd) {
case ALLOC_BUFFER: {
rpc_alloc_buffer(backend, input, output);
allocated_buffers.push_back(rpc_alloc_buffer(backend, input, output));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the allocated buffer into list.

@@ -950,7 +954,7 @@ static void rpc_serve_client(ggml_backend_t backend, sockfd_t sockfd, size_t fre
break;
}
case FREE_BUFFER: {
rpc_free_buffer(input);
allocated_buffers.remove(rpc_free_buffer(input));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the freed buffer from list


for (auto buff: allocated_buffers) {
ggml_backend_buffer_free(buff);
}
Copy link
Author

@chraac chraac May 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

free the reminding buffers.

@mofosyne mofosyne added bugfix fixes an issue or bug Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix labels May 20, 2024
@rgerganov
Copy link
Collaborator

rgerganov commented May 20, 2024

I think a better approach would be to track allocated buffers with std::unordered_set, so we can also perform input validation in SET_TENSOR, `GET_TENSOR, etc. I will submit a patch shortly and add you as a reviewer.

@chraac
Copy link
Author

chraac commented May 20, 2024

I think a better approach would be to track allocated buffers with std::unordered_set, so we can also perform input validation in SET_TENSOR, `GET_TENSOR, etc. I will submit a patch shortly and add you as a reviewer.

emm, just forgot the SET_TENSOR and GET_TENSOR case, see there's also a tensor allocation in there.

also, for the std::unordered_set, thought the std::list will be okay here, since we won't have too many tensor, so this will no impact the free operation's performance.

but agree that std::unordered_set is a better alternative, let use it instead.

@chraac
Copy link
Author

chraac commented May 20, 2024

close this PR and wait for @rgerganov 's fix, related discussion: #7407

@chraac chraac closed this May 20, 2024
@chraac chraac mentioned this pull request May 20, 2024
@chraac chraac deleted the dev-fix-rpc-mem-alloc branch June 1, 2024 03:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes an issue or bug Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants