[install-help]: Embedding server issue #178


Open
hulkito-nol opened this issue Apr 29, 2025 · 12 comments
Labels
help wanted (Extra attention is needed)

Comments

@hulkito-nol

Describe the issue
Hello,
Before asking, I have read, I think, all the threads about this Nextcloud application, here and everywhere…
I use the latest version of Nextcloud AIO on a QNAP TS-464 NAS.
For the AI integration, I use the OpenAI connector application with a paid Mistral AI account.
Because my current installation of the Context Chat Backend does not seem to work ("Failed request (500): Embedding Request Error: Error: the embedding server is not responding"), I wonder: what is this embedding server? I have seen the related configuration in config.yaml, but do I need an external application (server) to use it? Is it an Ollama or LocalAI instance, or a server internal to the application itself?
Thanks in advance for your answers.

Setup Details (please complete the following information):

  • Nextcloud Version: 31.0.2
  • AppAPI Version: 5.0.2
  • Context Chat PHP Version: php8.3
  • Context Chat Backend Version: 4.2.0
  • Nextcloud deployment method: Docker AIO
  • Context Chat Backend deployment method: one-click
hulkito-nol added the help wanted (Extra attention is needed) label on Apr 29, 2025
hulkito-nol changed the title from [install-help]: <short description> to [install-help]: Embedding server issue on Apr 29, 2025
@hulkito-nol
Author

Nobody to help me? Thanks in advance.

@kyteinsky
Contributor

kyteinsky commented May 3, 2025

Hello,
Thanks for looking up the available resources well in advance. The docs will be updated very soon to document the embedding server.
We use a technique called retrieval-augmented generation (RAG), where we generate embeddings (vectors of numbers) from the text of the Nextcloud documents to help find them easily. The embedding server included inside the Docker container in the default install does this for us.

The error message shown will be improved too, but it means that the internal embedding server could not start for some reason. The embedder's log files in the Docker container can give us a clue.
Running these commands should get those files:

docker exec -it nc_app_context_chat_backend bash
tail /nc_app_context_chat_backend_data/embedding_server_*
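As an additional quick check, you can probe whether anything is listening on the embedder's port from inside the container. This is a sketch: the actual host and port come from the embedding section of config.yaml, and port 5000 here is an assumption based on the default install:

docker exec nc_app_context_chat_backend python3 -c "import socket; socket.create_connection(('localhost', 5000), timeout=2); print('embedding server is listening')"

A ConnectionRefusedError here means the server never came up, in which case the log files above are the place to look.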

@hulkito-nol
Author

Hi, thank you for your help.
The only logs I have are these:
2025-05-04T10:49:13+0000: [ERROR|utils]: original traceback: Traceback (most recent call last):
  File "/app/context_chat_backend/utils.py", line 74, in exception_wrap
    resconn.send({ 'value': fun(*args, **kwargs), 'error': None })
    ^^^^^^^^^^^^^^^^^^^^
  File "/app/context_chat_backend/chain/one_shot.py", line 65, in process_context_query
    db = vectordb_loader.load()
         ^^^^^^^^^^^^^^^^^^^^^^
  File "/app/context_chat_backend/dyn_loader.py", line 113, in load
    self.em_loader.load()
  File "/app/context_chat_backend/dyn_loader.py", line 92, in load
    raise EmbeddingException('Error: the embedding server is not responding')
context_chat_backend.types.EmbeddingException: Error: the embedding server is not responding

2025-05-04T10:49:13+0000: [ERROR|controller]: Error occurred in an embedding request: /query: Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.11/dist-packages/starlette/routing.py", line 73, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/fastapi/routing.py", line 214, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/starlette/concurrency.py", line 37, in run_in_threadpool
    return await anyio.to_thread.run_sync(func)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/anyio/_backends/_asyncio.py", line 2461, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/anyio/_backends/_asyncio.py", line 962, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/context_chat_backend/controller.py", line 183, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/context_chat_backend/controller.py", line 466, in _
    return execute_query(query)
           ^^^^^^^^^^^^^^^^^^^^
  File "/app/context_chat_backend/controller.py", line 455, in execute_query
    return exec_in_proc(target=target, args=args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/context_chat_backend/utils.py", line 97, in exec_in_proc
    raise result['error']
context_chat_backend.types.EmbeddingException: Error: the embedding server is not responding

2025-05-04T10:49:13+0000: [ERROR|utils]: Failed request (500): Embedding Request Error: Error: the embedding server is not responding
INFO: 172.29.172.20:56048 - "POST /query HTTP/1.1" 500 Internal Server Error

The log files related to the embedding server are all empty...

@kyteinsky
Contributor

Ah, I'm sorry, the path for the logs is incorrect; /logs/ is missing there.
The correct command would be:

docker exec nc_app_context_chat_backend cat /nc_app_context_chat_backend_data/logs/embedding_server_*
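If that glob matches nothing, listing the directory first shows which log files actually exist (a sketch using the same path as above):

docker exec nc_app_context_chat_backend ls -la /nc_app_context_chat_backend_data/logs/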

@hulkito-nol
Author

Hi, as I said, I have these logs, but all are empty...

[image: screenshot of the embedding_server log files, all empty]

@kyteinsky
Contributor

kyteinsky commented May 8, 2025

Ah okay.
Empty logs could mean it did not start at all. Would you mind posting your config?

docker exec nc_app_context_chat_backend cat /nc_app_context_chat_backend_data/config.yaml

And does your system have AVX support? grep avx /proc/cpuinfo
Also, is anything running on port 5000? ss -tulpn | grep LISTEN | grep 5000
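For reference, narrowing the grep to just the flag names makes the answer obvious (a sketch; on AVX-capable CPUs this prints lines like avx and avx2, while empty output means the CPU has no AVX at all):

grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u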

@hulkito-nol
Author

hulkito-nol commented May 8, 2025

I have a QNAP TS-464 with no AVX support... so grep avx /proc/cpuinfo returns nothing.
ss -tulpn | grep LISTEN | grep 5000:
tcp 0 0 :::5000 :::* LISTEN

I have tried to change the port in the config.yaml, but with no success.

@hulkito-nol
Author

My config:

# SPDX-FileCopyrightText: 2024 Nextcloud GmbH and Nextcloud contributors
# SPDX-License-Identifier: AGPL-3.0-or-later
debug: true
uvicorn_log_level: debug
disable_aaa: false
httpx_verify_ssl: false
use_colors: true
uvicorn_workers: 1
embedding_chunk_size: 2000
doc_parser_worker_limit: 10


vectordb:
  pgvector:
    # all options: https://python.langchain.com/api_reference/postgres/vectorstores/langchain_postgres.vectorstores.PGVector.html
    # 'connection' overrides the env var 'CCB_DB_URL'

embedding:
  protocol: http
  host: 192.168.1.179
  port: 6787
  workers: 1
  offload_after_mins: 15 # in minutes
  request_timeout: 1800 # in seconds
  llama:
    # all options: https://python.langchain.com/api_reference/community/embeddings/langchain_community.embeddings.llamacpp.LlamaCppEmbeddings.html
    # 'model_alias' is reserved
    # 'embedding' is always set to True
    model: multilingual-e5-large-instruct-q6_k.gguf
    n_batch: 16
    n_ctx: 8192

llm:
  nc_texttotext:

  llama:
    # all options: https://python.langchain.com/api_reference/community/llms/langchain_community.llms.llamacpp.LlamaCpp.html
    model_path: dolphin-2.2.1-mistral-7b.Q5_K_M.gguf
    n_batch: 512
    n_ctx: 8192
    max_tokens: 4096
    template: "<|im_start|> system \nYou're an AI assistant named Nextcloud Assistant, good at finding relevant context from documents to answer questions provided by the user. <|im_end|>\n<|im_start|> user\nUse the following documents as context to answer the question at the end. REMEMBER to excersice source critisicm as the documents are returned by a search provider that can return unrelated documents.\n\nSTART OF CONTEXT:>
    no_ctx_template: "<|im_start|> system \nYou're an AI assistant named Nextcloud Assistant.<|im_end|>\n<|im_start|> user\n{question}<|im_end|>\n<|im_start|> assistant\n"
    end_separator: "<|im_end|>"

  ctransformer:
    # all options: https://python.langchain.com/api_reference/community/llms/langchain_community.llms.ctransformers.CTransformers.html
    model: dolphin-2.2.1-mistral-7b.Q5_K_M.gguf
    template: "<|im_start|> system \nYou're an AI assistant named Nextcloud Assistant, good at finding relevant context from documents to answer questions provided by the user. <|im_end|>\n<|im_start|> user\nUse the following documents as context to answer the question at the end. REMEMBER to excersice source critisicm as the documents are returned by a search provider that can return unrelated documents.\n\nSTART OF CONTEXT:>
    no_ctx_template: "<|im_start|> system \nYou're an AI assistant named Nextcloud Assistant.<|im_end|>\n<|im_start|> user\n{question}<|im_end|>\n<|im_start|> assistant\n"
    end_separator: "<|im_end|>"
    config:
      context_length: 8192
      max_new_tokens: 4096
      local_files_only: True

  hugging_face:
    # all options: https://python.langchain.com/api_reference/community/llms/langchain_community.llms.huggingface_pipeline.HuggingFacePipeline.html
    model_id: gpt2
    task: text-generation
    pipeline_kwargs:
      config:
        max_length: 200
    template: ""


@kyteinsky
Contributor

I have a QNAP TS-464 with no AVX support... so grep avx /proc/cpuinfo returns nothing.

Well, that's a bummer. We don't support systems without AVX.
If this were a manual setup, you could re-compile llama-cpp-python without AVX support with CMAKE_ARGS="-DLLAMA_AVX2=OFF" pip install llama-cpp-python (untested). See https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#installation-configuration and abetlen/llama-cpp-python#284 (comment)

host: 192.168.1.179

Which IP has been used here? Can you try with host: localhost?
The embedding server is meant to be accessed primarily by the Python CCB server, so this would be enough to get it started, at least.
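For reference, the embedding section of the config you posted above would then read (a sketch; only host changes, everything else stays as you have it):

embedding:
  protocol: http
  host: localhost
  port: 6787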

@hulkito-nol
Author

hulkito-nol commented May 8, 2025

It's the IP of my host, but I have tried localhost, 127.0.0.1, 0.0.0.0... same result.

This is not a manual setup. I have installed the Context Chat Backend directly from the Nextcloud application.

@hulkito-nol
Author

hulkito-nol commented May 8, 2025

Is this to be executed in the container, or do I need to rebuild a new image?

If this were a manual setup, you could re-compile llama-cpp-python without AVX support with CMAKE_ARGS="-DLLAMA_AVX2=OFF" pip install llama-cpp-python (untested).

@kyteinsky
Contributor

Can you confirm that nothing is running on port 6787? And whether something on your system, like SELinux or AppArmor, is preventing the binding of the port?

sudo lsof -i :6787
sudo netstat -ltnup  | grep 6787

Also, the logs should show the exception message if the embedding server does not start. Would you mind upgrading context_chat_backend to 4.3.0 and posting the logs?

is it to execute in the container or do I rebuild a new image ?

To execute in the container. Unfortunately, it has to be redone after every update. We might change things in the long run to automate this for people who want to customize the build process of llama-cpp-python.
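Roughly like this (a sketch, untested as noted above; -DLLAMA_AVX=OFF is an extra assumption here because the TS-464's CPU lacks AVX entirely, and the exact CMake flag names can differ between llama-cpp-python versions):

docker exec -it nc_app_context_chat_backend bash
# then, inside the container, rebuild the wheel from source:
CMAKE_ARGS="-DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF" pip install --force-reinstall --no-cache-dir llama-cpp-python

--force-reinstall and --no-cache-dir make pip rebuild llama-cpp-python from source instead of reusing a cached wheel.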
