Closed
Description
Hi, I am trying to get token probabilities using the latest code from the main branch, compiled with CMake on Linux. Compilation produced some warnings (not important), but after running the server binary and sending an inference request, I got an empty completion_probabilities field.
Request:
curl --request POST \
--url http://192.168.41.197:8081/completion \
--header "Content-Type: application/json" \
--data '{"prompt": "Some stories about railway station","n_predict": 256, "n_probs" : 3}'
Response:
{"completion_probabilities":[],"content":" Rail statins ........ some content here","generation_settings":{"frequency_penalty":0.0,"grammar":"","ignore_eos":false,"logit_bias":[],"min_p":0.05000000074505806,"mirostat":0,"mirostat_eta":0.10000000149011612,"mirostat_tau":5.0,"model":"/home/max/DISK2/llama13b_200K/ggml-model-f16.gguf","n_ctx":512,"n_keep":0,"n_predict":256,"n_probs":3,"penalize_nl":true,"presence_penalty":0.0,"repeat_last_n":64,"repeat_penalty":1.100000023841858,"seed":4294967295,"stop":[],"stream":false,"temp":0.800000011920929,"tfs_z":1.0,"top_k":40,"top_p":0.949999988079071,"typical_p":1.0},"model":"/home/max/DISK2/llama13b_200K/ggml-model-f16.gguf","prompt":"Some stories about railway station","slot_id":0,"stop":true,"stopped_eos":true,"stopped_limit":false,"stopped_word":false,"stopping_word":"","timings":{"predicted_ms":3258.927,"predicted_n":55,"predicted_per_second":16.87672046658302,"predicted_per_token_ms":59.253218181818184,"prompt_ms":429.54,"prompt_n":20,"prompt_per_second":46.561437817199796,"prompt_per_token_ms":21.477},"tokens_cached":75,"tokens_evaluated":20,"tokens_predicted":55,"truncated":false}
Have I made a mistake somewhere, or is this a bug?
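For anyone who wants to reproduce this without reading the raw JSON by eye, here is a minimal Python sketch of the same check. The helper names (build_payload, probabilities_missing, post_completion) are hypothetical and only for illustration; the field names come from the request and response above, and the sample dictionary is a trimmed copy of that response. The network helper is defined but not called, since the server address is specific to my machine.

```python
import json
import urllib.request


def build_payload(prompt, n_predict=256, n_probs=3):
    """Build the same JSON body that the curl request above sends."""
    return {"prompt": prompt, "n_predict": n_predict, "n_probs": n_probs}


def probabilities_missing(response):
    """Return True when n_probs was requested but no probabilities came back."""
    requested = response.get("generation_settings", {}).get("n_probs", 0)
    returned = response.get("completion_probabilities", [])
    return requested > 0 and len(returned) == 0


def post_completion(url, payload):
    """POST the payload to the /completion endpoint and decode the JSON reply.

    Hypothetical helper: it mirrors the curl call and is not executed here.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))


# Trimmed copy of the response shown above (only the fields the check needs).
sample = {
    "completion_probabilities": [],
    "generation_settings": {"n_probs": 3},
    "tokens_predicted": 55,
}

print(probabilities_missing(sample))  # True: 3 probabilities requested, none returned
```

Against a live server, the equivalent call would be probabilities_missing(post_completion(url, build_payload("Some stories about railway station"))), with url set to the /completion address used in the request.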