LlamaEdge-RAG 0.11.1

github-actions released this 21 Dec 15:58

· 78 commits to main since this release

7e44986

Major changes:

(New) Support API key

Use API_KEY environment variable to set api-key when start API server, for example

export LLAMA_API_KEY=12345-6789-abcdef
wasmedge --dir .:. --env API_KEY=$LLAMA_API_KEY \
  --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
  rag-api-server.wasm \
  ...

Send each request with the corresponding api-key, for example

curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'Authorization: Bearer 12345-6789-abcdef' \
--header 'Content-Type: application/json' \
--data '...'

(New) Add --context-window CLI option for specifying the maximum number of user messages for the context retrieval. Note that if the context_window field in the chat completion request appears, then ignore the setting of the CLI option.
```
--context-window <CONTEXT_WINDOW>
          Maximum number of user messages used in the retrieval [default: 1]
```

Assets 4