LlamaEdge-RAG 0.11.1
Major changes:
- (New) Support API key
  - Use the `API_KEY` environment variable to set the api-key when starting the API server, for example

    ```bash
    export LLAMA_API_KEY=12345-6789-abcdef

    wasmedge --dir .:. --env API_KEY=$LLAMA_API_KEY \
      --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \
      --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
      rag-api-server.wasm \
      ...
    ```
  - Send each request with the corresponding api-key, for example

    ```bash
    curl --location 'http://localhost:8080/v1/chat/completions' \
      --header 'Authorization: Bearer 12345-6789-abcdef' \
      --header 'Content-Type: application/json' \
      --data '...'
    ```
- (New) Add the `--context-window` CLI option for specifying the maximum number of user messages used in the context retrieval. Note that if the `context_window` field appears in the chat completion request, it overrides the setting of the CLI option.

  ```
  --context-window <CONTEXT_WINDOW>
      Maximum number of user messages used in the retrieval [default: 1]
  ```