Skip to content

LlamaEdge-RAG 0.11.1

Choose a tag to compare

@github-actions github-actions released this 21 Dec 15:58
· 78 commits to main since this release

Major changes:

  • (New) Support API key

    • Use API_KEY environment variable to set api-key when start API server, for example
      export LLAMA_API_KEY=12345-6789-abcdef
      wasmedge --dir .:. --env API_KEY=$LLAMA_API_KEY \
        --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \
        --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
        rag-api-server.wasm \
        ...
    • Send each request with the corresponding api-key, for example
      curl --location 'http://localhost:8080/v1/chat/completions' \
      --header 'Authorization: Bearer 12345-6789-abcdef' \
      --header 'Content-Type: application/json' \
      --data '...'
  • (New) Add --context-window CLI option for specifying the maximum number of user messages for the context retrieval. Note that if the context_window field in the chat completion request appears, then ignore the setting of the CLI option.

    --context-window <CONTEXT_WINDOW>
              Maximum number of user messages used in the retrieval [default: 1]