Releases: LlamaEdge/rag-api-server
LlamaEdge-RAG 0.13.5
Major change:
- Upgrade `llama-core` dependency to `0.26.7`
LlamaEdge-RAG 0.13.4
Major changes:
- Improve the streaming workflow
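Streaming is driven by the OpenAI-compatible `stream` field on `/v1/chat/completions`. A minimal request sketch, assuming a server on the port used elsewhere in these notes; the model name is illustrative:

```bash
curl --location 'http://localhost:8080/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [{"role": "user", "content": "What is LlamaEdge?"}],
    "model": "Llama-3.2-3B-Instruct",
    "stream": true
  }'
```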
LlamaEdge-RAG 0.13.3
Major changes:
- Update prompt type check against tool-use models in stream mode
LlamaEdge-RAG 0.13.2
Major changes:
- Support tool use of second-state/Mistral-Small-24B-Instruct-2501-GGUF (see the request sketch below)
- Upgrade `chat-prompts` to `0.21.0`
- Upgrade `llama-core` to `0.26.4`
Known issue:
- The prompt type check blocks `Mistral-Small-24B-Instruct-2501` in stream mode
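A hedged sketch of a tool-use request against `/v1/chat/completions`; the `tools` schema follows the OpenAI-compatible format, and the function definition is illustrative. Given the known issue above, streaming is left disabled:

```bash
curl --location 'http://localhost:8080/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "Mistral-Small-24B-Instruct-2501",
    "messages": [{"role": "user", "content": "What is the weather in Paris today?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "The city name"}
          },
          "required": ["city"]
        }
      }
    }],
    "stream": false
  }'
```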
LlamaEdge-RAG 0.13.1
Major changes:
- Support Mistral-Small-24B-Instruct-2501-GGUF
- Upgrade deps:
  - `llama-core v0.26.3`
  - `chat-prompts v0.20.0`
LlamaEdge-RAG 0.13.0
Major changes:
- Support keyword search
  - (NEW) Add `--kw-search-url` CLI option for specifying the URL of the keyword search server (see the invocation sketch below)
  - (BREAKING) Change the type of the response body returned by the `/v1/create/rag` endpoint from `EmbeddingResponse` to `CreateRagResponse`
- Upgrade dependencies
  - `chat-prompts v0.19.1`
  - `endpoints v0.24.1`
  - `llama-core v0.26.2`
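A minimal sketch of pointing the server at a keyword search service via the new flag; the keyword search server address and the surrounding flags are illustrative, not documented defaults:

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
  rag-api-server.wasm \
  --kw-search-url http://localhost:9069 \
  ...
```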
LlamaEdge-RAG 0.12.1
Major changes:
- (NEW) Add the `--ubatch-size` CLI option
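A hedged invocation sketch; `512` mirrors the usual `llama.cpp` default for the physical batch size, and the surrounding flags are illustrative:

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
  rag-api-server.wasm \
  --ubatch-size 512 \
  ...
```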
LlamaEdge-RAG 0.12.0
Major changes:
- (NEW) Add the `--split-mode` CLI option
- (BREAKING) Update the `--n-predict` CLI option (see the sketch after this list)
  - Update the type to `i32`
  - Update the default value to `-1`, keeping it consistent with the `--n-predict` CLI option of `llama.cpp`
- Upgrade deps:
  - `llama-core v0.26.0`
  - `chat-prompts v0.19.0`
  - `endpoints v0.24.0`
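A hedged sketch combining the two updated options; `layer` is one of the split modes used by `llama.cpp`-based runtimes (an assumption for this server), and `-1` is the new default, meaning no generation limit:

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
  rag-api-server.wasm \
  --split-mode layer \
  --n-predict -1 \
  ...
```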
LlamaEdge-RAG 0.11.2
Major changes:
- Upgrade deps:
  - `llama-core v0.25.3`
  - `chat-prompts v0.18.6`
  - `endpoints v0.23.2`
LlamaEdge-RAG 0.11.1
Major changes:
- (NEW) Support API key
  - Use the `API_KEY` environment variable to set the api-key when starting the API server, for example:

    ```bash
    export LLAMA_API_KEY=12345-6789-abcdef
    wasmedge --dir .:. --env API_KEY=$LLAMA_API_KEY \
      --nn-preload default:GGML:AUTO:Llama-3.2-3B-Instruct-Q5_K_M.gguf \
      --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
      rag-api-server.wasm \
      ...
    ```
  - Send each request with the corresponding api-key, for example:

    ```bash
    curl --location 'http://localhost:8080/v1/chat/completions' \
      --header 'Authorization: Bearer 12345-6789-abcdef' \
      --header 'Content-Type: application/json' \
      --data '...'
    ```
- (NEW) Add the `--context-window` CLI option for specifying the maximum number of user messages used in context retrieval. Note that if the `context_window` field appears in the chat completion request, the setting of the CLI option is ignored.

  ```
  --context-window <CONTEXT_WINDOW>
          Maximum number of user messages used in the retrieval [default: 1]
  ```
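Per the note above, a `context_window` field in the request body takes precedence over the CLI option. A minimal sketch of such a request; the field value is illustrative:

```bash
curl --location 'http://localhost:8080/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [{"role": "user", "content": "Summarize the uploaded document."}],
    "context_window": 2
  }'
```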