Closed
Description
Observed behavior
If I send a string for the system prompt instead of the expected JSON object, the server terminates.
Desired behavior
The server responds with an HTTP 400 error and doesn't terminate.
Environment
Running the server via docker:
docker run -v /path/to/models:/models -p 8000:8000 ghcr.io/ggerganov/llama.cpp@sha256:b4675af8c9a8b3e7019a7baf536b95c3984a9aaacd0eafce7422377a299e31f4 -m /models/Meta-Llama-3-8B.Q4_K_M.gguf --port 8000 --host 0.0.0.0 -n 512
Using this GGUF quant (though I don't think it matters: the server also crashes with the other GGUF files I have tried).
Request
json_body=$(cat <<EOF
{
  "system_prompt": "Always reply in markdown lists.",
  "prompt": "Building a website can be done in 10 simple steps:",
  "n_predict": 128
}
EOF
)
curl --request POST \
  --url http://localhost:8000/completion \
  --header "Content-Type: application/json" \
  --data "$json_body"
Server error logs
phantom-llama-1 | terminate called after throwing an instance of 'nlohmann::json_abi_v3_11_3::detail::type_error'
phantom-llama-1 | what(): [json.exception.type_error.306] cannot use value() with string
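For context on the log above: in nlohmann::json, value() is only valid on JSON objects and throws type_error 306 for any other type, and "terminate called after throwing" means that exception was never caught. The sketch below illustrates the desired behavior under stated assumptions. It is not llama.cpp's code: Field, Object, Response, value and handle_system_prompt are hypothetical stand-ins built on the standard library only, so the example compiles without the JSON header.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical stand-in for a parsed JSON field: either a plain string or an
// object (string -> string). The real server uses nlohmann::json instead.
using Object = std::map<std::string, std::string>;
using Field = std::variant<std::string, Object>;

// Mimics the failure mode in the logs: a keyed lookup with a fallback only
// makes sense on an object, so a string input throws.
std::string value(const Field& field, const std::string& key,
                  const std::string& fallback) {
    const Object* obj = std::get_if<Object>(&field);
    if (obj == nullptr) {
        throw std::invalid_argument("cannot use value() with string");
    }
    auto it = obj->find(key);
    return it == obj->end() ? fallback : it->second;
}

// Hypothetical response type: just a status code and a body.
struct Response {
    int status;
    std::string body;
};

// Hypothetical handler: any exception raised while reading the request is
// converted into an HTTP 400 response, so it cannot escape the handler and
// reach std::terminate.
Response handle_system_prompt(const Field& system_prompt) {
    try {
        std::string prompt = value(system_prompt, "prompt", "");
        return {200, "system prompt set: " + prompt};
    } catch (const std::exception& e) {
        return {400, std::string("invalid system_prompt: ") + e.what()};
    }
}
```

With this shape, passing the string from the request above to handle_system_prompt yields a 400 response carrying the error text, while an object containing a "prompt" key yields 200. In the real server the equivalent would presumably be a try/catch around the JSON access in the request handler, or an explicit type check before calling value().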