
Header prompt displayed using Llama3.1 with ollama #1484


Closed
avirgos opened this issue Sep 24, 2024 · 3 comments
Labels: support (A request for help setting things up)

Comments

@avirgos

avirgos commented Sep 24, 2024

Hello,

I'm using the llama3.1:latest model with Ollama and I'm having trouble correctly initializing the chatPromptTemplate variable.

I used this GitHub issue to initialize the variable: #1035

Here is my .env.local file:

MONGODB_URL=mongodb://mongodb:27017
HF_TOKEN=<hf-token>

PUBLIC_APP_NAME=<name>

MODELS=`[
  {
    "name": "Ollama | Llama3.1",
    "chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.1,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 3072,
      "max_new_tokens": 1024,
      "stop": ["<|end_of_text|>", "<|eot_id|>"]
    },
    "endpoints": [
      {
        "type": "ollama",
        "url" : "http://ollama:11434",
        "ollamaName" : "llama3.1:latest"
      }
    ]
  }
]`

But <|start_header_id|>assistant<|end_header_id|> appears in every response:

(screenshot: a chat-ui response that starts with the assistant header token)
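
For reference, here is roughly how I understand this template gets rendered. This is a minimal sketch using plain Handlebars with stand-in ifUser/ifAssistant helpers and an assumed message shape ({ from, content }); it is not the actual chat-ui code.

```ts
import Handlebars from "handlebars";

// Stand-in block helpers approximating what chat-ui seems to provide for
// chat prompt templates (assumption: messages carry `from` and `content`).
type Message = { from: "user" | "assistant"; content: string };

Handlebars.registerHelper("ifUser", function (this: Message, options) {
  return this.from === "user" ? options.fn(this) : "";
});
Handlebars.registerHelper("ifAssistant", function (this: Message, options) {
  return this.from === "assistant" ? options.fn(this) : "";
});

const template =
  "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}" +
  "{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}" +
  "{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}";

const render = Handlebars.compile(template, { noEscape: true });

// Note: the rendered prompt ends right after the user's <|eot_id|>.
console.log(
  render({
    preprompt: "You are a helpful assistant.",
    messages: [{ from: "user", content: "Hello!" }],
  })
);
```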

Can you help me make it disappear by modifying the chatPromptTemplate variable?

Thanks in advance.

avirgos added the support label on Sep 24, 2024
@nsarrazin
Copy link
Collaborator

You're missing the initial generation prompt from your chatPromptTemplate!

I'd recommend setting tokenizer: 'meta-llama/Llama-3.1-70B-Instruct' in your model config and removing chatPromptTemplate.
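
Roughly like this, keeping the rest of your entry (parameters, endpoints) as it is now; a sketch, not tested:

```
MODELS=`[
  {
    "name": "Ollama | Llama3.1",
    "tokenizer": "meta-llama/Llama-3.1-70B-Instruct",
    "parameters": { ... },
    "endpoints": [{ "type": "ollama", "url": "http://ollama:11434", "ollamaName": "llama3.1:latest" }]
  }
]`
```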

If that's not possible, you can fix your chatPromptTemplate like so:

    "chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}<|start_header_id|>assistant<|end_header_id|>\n\n",

I didn't try it, but that should work, let me know!

@avirgos
Author

avirgos commented Sep 30, 2024

> You're missing the initial generation prompt from your chatPromptTemplate!
>
> I'd recommend setting tokenizer: 'meta-llama/Llama-3.1-70B-Instruct' in your model config and removing chatPromptTemplate.
>
> If that's not possible, you can fix your chatPromptTemplate like so:
>
>     "chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}<|start_header_id|>assistant<|end_header_id|>\n\n",
>
> I didn't try it, but that should work, let me know!

Hello @nsarrazin,

I tried the tokenizer solution.

Here is the corresponding .env.local file:

MONGODB_URL=mongodb://mongodb:27017
HF_TOKEN=<hf-token>

PUBLIC_APP_NAME=<name>

MODELS=`[
  {
    "name": "Ollama | Llama3.1",
    "tokenizer": {
      "tokenizerUrl": "https://huggingface.co/nsarrazin/llama3.1-tokenizer/resolve/main/tokenizer.json",
      "tokenizerConfigUrl": "https://huggingface.co/nsarrazin/llama3.1-tokenizer/raw/main/tokenizer_config.json"
    },
    "parameters": {
      "temperature": 0.1,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 3072,
      "max_new_tokens": 1024,
      "stop": ["<|end_of_text|>", "<|eot_id|>"]
    },
    "endpoints": [
      {
        "type": "ollama",
        "url" : "http://ollama:11434",
        "ollamaName" : "llama3.1:latest"
      }
    ]
  }
]`

But this solution didn't work; here is the corresponding log:

{"level":50,"time":1727685091785,"pid":30,"hostname":"2209111a4d83","err":{"type":"TypeError","message":"fetch failed: Connect Timeout Error (attempted addresses: 18.155.129.4:443, 18.155.129.31:443, 18.155.129.129:443, 18.155.129.60:443)","stack":"TypeError: fetch failed\n at node:internal/deps/undici/undici:13178:13\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\n at runNextTicks (node:internal/process/task_queues:64:3)\n at process.processImmediate (node:internal/timers:454:9)\n at async getModelFile (file:///app/node_modules/@huggingface/transformers/dist/transformers.mjs:26549:24)\n at async getModelJSON (file:///app/node_modules/@huggingface/transformers/dist/transformers.mjs:26653:18)\n at async Promise.all (index 0)\n at async loadTokenizer (file:///app/node_modules/@huggingface/transformers/dist/transformers.mjs:20190:18)\n at async AutoTokenizer.from_pretrained (file:///app/node_modules/@huggingface/transformers/dist/transformers.mjs:24498:50)\n at async getTokenizer (file:///app/build/server/chunks/getTokenizer-cTh4LwyT.js:5:12)\ncaused by: ConnectTimeoutError: Connect Timeout Error (attempted addresses: 18.155.129.4:443, 18.155.129.31:443, 18.155.129.129:443, 18.155.129.60:443)\n at onConnectTimeout (node:internal/deps/undici/undici:2331:28)\n at node:internal/deps/undici/undici:2283:50\n at Immediate._onImmediate (node:internal/deps/undici/undici:2315:13)\n at process.processImmediate (node:internal/timers:483:21)"},"msg":"Failed to load tokenizer for model Ollama | Llama3.1 consider setting chatPromptTemplate manually or making sure the model is available on the hub."}

So I tried the second solution, which was to modify the chatPromptTemplate variable.

Here is the corresponding .env.local file:

MONGODB_URL=mongodb://mongodb:27017
HF_TOKEN=<hf-token>

PUBLIC_APP_NAME=<name>

MODELS=`[
  {
    "name": "Ollama | Llama3.1",
    "chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}<|start_header_id|>assistant<|end_header_id|>\n\n",
    "parameters": {
      "temperature": 0.1,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 3072,
      "max_new_tokens": 1024,
      "stop": ["<|end_of_text|>", "<|eot_id|>"]
    },
    "endpoints": [
      {
        "type": "ollama",
        "url" : "http://ollama:11434",
        "ollamaName" : "llama3.1:latest"
      }
    ]
  }
]`

And this second solution works!

Thank you for your help.
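
In case it helps anyone else hitting this: the rendered prompt can also be sanity-checked against Ollama directly, since /api/generate accepts a pre-templated prompt when raw is set. A rough sketch (separate from chat-ui's own code path; the hard-coded prompt is just an example):

```ts
// Send an already-templated Llama 3.1 prompt straight to Ollama,
// bypassing Ollama's own prompt template with `raw: true`.
const prompt =
  "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n" +
  "Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n";

const res = await fetch("http://ollama:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.1:latest",
    prompt,
    raw: true,
    stream: false,
    options: { temperature: 0.1, stop: ["<|end_of_text|>", "<|eot_id|>"] },
  }),
});

// With the trailing assistant header in place, the reply starts with the answer
// itself rather than another <|start_header_id|>assistant<|end_header_id|>.
const { response } = await res.json();
console.log(response);
```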

@nsarrazin
Collaborator

Glad it worked! I'll look into why the first solution didn't work; it should have 👀
