Info
Version: af0a5b6
Intel x86_64 with LLAMA_CUDA=1
Summary
When ./server is given an invalid payload at the /v1/chat/completions route (for example an empty JSON object, {}), the server crashes with a segmentation fault. This denies service to all clients until the server is restarted.
I stumbled upon this and have not thoroughly assessed the other APIs or payload parameters for similar crashes. If it is easy to check which other routes lack the error handling that /v1/chat/completions is missing, I think someone should do so (I am not yet familiar enough with the codebase to look myself).
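One way to look for similar crashes without reading the code is to send a handful of malformed bodies to each route and note when the server stops answering. The following is a rough sketch, not a vetted fuzzer: the function names and the payload list are made up for illustration (only `{}` and `/v1/chat/completions` come from this report), and any further routes have to be supplied by whoever runs it.

```python
import urllib.error
import urllib.request

# Malformed or incomplete request bodies to try.
# Only "{}" comes from this report; the rest are illustrative guesses.
PAYLOADS = ["{}", "[]", '{"messages": null}', '{"messages": "text"}', "{"]


def http_post(url, body, timeout=5.0):
    """POST `body` with a JSON content type and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx reply means the server survived and answered.
        return err.code


def probe(base_url, routes, payloads=PAYLOADS, post=http_post):
    """Return (route, payload) pairs after which no HTTP reply came back."""
    suspects = []
    for route in routes:
        for body in payloads:
            try:
                post(base_url + route, body)
            except OSError:
                # Connection refused/reset or timeout: the process may have died.
                suspects.append((route, body))
    return suspects


if __name__ == "__main__":
    # Host and port taken from the curl command in the Example section.
    print(probe("http://127.0.0.1:8081", ["/v1/chat/completions"]))
```

Once the server has crashed, every later request fails as well, so only the first reported pair is meaningful unless the server is restarted between probes (for example by a supervisor or a loop around ./server).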
Example
$ gdb ./server
[... SNIP ...]
(gdb) r --model models/Meta-Llama-3-8B-Instruct.Q8_0.gguf --host 0.0.0.0
$ curl -X POST http://127.0.0.1:8081/v1/chat/completions -H 'Content-Type: application/json' --data '{}'
Thread 13 "server" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7efe71fff000 (LWP 567)]
0x000055e27db04601 in decltype (((from_json_array_impl({parm#1}, {parm#2}, (nlohmann::json_abi_v3_11_3::detail::priority_tag<3u>){})),(({parm#1}.(get<std::vector<nlohmann::json_abi_v3_11_3::basic_json<nlohmann::json_abi_v3_11_3::ordered_map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> >, void>, std::allocator<nlohmann::json_abi_v3_11_3::basic_json<nlohmann::json_abi_v3_11_3::ordered_map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> >, void> > >::value_type>))())),((void)())) nlohmann::json_abi_v3_11_3::detail::from_json<nlohmann::json_abi_v3_11_3::basic_json<nlohmann::json_abi_v3_11_3::ordered_map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> >, void>, std::vector<nlohmann::json_abi_v3_11_3::basic_json<nlohmann::json_abi_v3_11_3::ordered_map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> >, void>, std::allocator<nlohmann::json_abi_v3_11_3::basic_json<nlohmann::json_abi_v3_11_3::ordered_map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> >, void> > >, 
0>(nlohmann::json_abi_v3_11_3::basic_json<nlohmann::json_abi_v3_11_3::ordered_map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> >, void> const&, std::vector<nlohmann::json_abi_v3_11_3::basic_json<nlohmann::json_abi_v3_11_3::ordered_map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> >, void>, std::allocator<nlohmann::json_abi_v3_11_3::basic_json<nlohmann::json_abi_v3_11_3::ordered_map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> >, void> > >&) ()
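The faulting frame is nlohmann::json's from_json converting a JSON value into a std::vector of JSON values, which is consistent with the handler reading an array field (presumably `messages`, as in the OpenAI chat API) from a body that does not contain one. I have not confirmed this in the source. The kind of check that appears to be missing can be sketched in a language-neutral way as below; the real server is C++, and the function name and error strings here are hypothetical.

```python
import json


def validate_chat_request(raw_body):
    """Return (status, message); status 200 means the body looks usable."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError as err:
        return 400, f"body is not valid JSON: {err.msg}"
    if not isinstance(body, dict):
        return 400, "body must be a JSON object"
    # Assumption: the route needs an array under "messages".
    messages = body.get("messages")
    if not isinstance(messages, list):
        return 400, "'messages' must be an array"
    return 200, "ok"


if __name__ == "__main__":
    # The payload from the curl command above is rejected instead of crashing.
    print(validate_chat_request("{}"))
```

The point is only that a missing or mistyped field should produce a 4xx error response rather than reach the type conversion unchecked.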
Impact
Any reachable llama.cpp ./server endpoint can, at a minimum, be crashed with an invalid payload. This makes the server and all of its API endpoints unavailable until it is restarted.
I have not assessed whether the segfault can have security impact beyond DoS.