
Header prompt displayed using Llama3.1 with ollama #1484


Closed
avirgos opened this issue Sep 24, 2024 · 3 comments
Labels: support (A request for help setting things up)

Comments

@avirgos

avirgos commented Sep 24, 2024

Hello,

I'm using the llama3.1:latest model with Ollama and I'm having trouble correctly initializing the chatPromptTemplate variable.

I used this GitHub issue to initialize the variable: #1035

Here is my .env.local file:

MONGODB_URL=mongodb://mongodb:27017
HF_TOKEN=<hf-token>

PUBLIC_APP_NAME=<name>

MODELS=`[
  {
    "name": "Ollama | Llama3.1",
    "chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.1,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 3072,
      "max_new_tokens": 1024,
      "stop": ["<|end_of_text|>", "<|eot_id|>"]
    },
    "endpoints": [
      {
        "type": "ollama",
        "url" : "http://ollama:11434",
        "ollamaName" : "llama3.1:latest"
      }
    ]
  }
]`

But <|start_header_id|>assistant<|end_header_id|> appears in every response:

(screenshot: a chat-ui response that starts with the assistant header token)
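
For reference, here is roughly how I understand this template gets rendered. This is a minimal sketch using plain Handlebars with stand-in ifUser/ifAssistant helpers and an assumed message shape ({ from, content }); it is not the actual chat-ui code.

```ts
import Handlebars from "handlebars";

// Stand-in block helpers approximating what chat-ui seems to provide for
// chat prompt templates (assumption: messages carry `from` and `content`).
type Message = { from: "user" | "assistant"; content: string };

Handlebars.registerHelper("ifUser", function (this: Message, options) {
  return this.from === "user" ? options.fn(this) : "";
});
Handlebars.registerHelper("ifAssistant", function (this: Message, options) {
  return this.from === "assistant" ? options.fn(this) : "";
});

const template =
  "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}" +
  "{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}" +
  "{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}";

const render = Handlebars.compile(template, { noEscape: true });

// Note: the rendered prompt ends right after the user's <|eot_id|>.
console.log(
  render({
    preprompt: "You are a helpful assistant.",
    messages: [{ from: "user", content: "Hello!" }],
  })
);
```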

Can you help me make it disappear by modifying the chatPromptTemplate variable?

Thanks in advance.

avirgos added the support label on Sep 24, 2024
@nsarrazin
Copy link
Collaborator

You're missing the initial generation prompt from your chatPromptTemplate!

I'd recommend setting tokenizer: 'meta-llama/Llama-3.1-70B-Instruct' in your model config and removing chatPromptTemplate.
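
Roughly like this, keeping the rest of your entry (parameters, endpoints) as it is now; a sketch, not tested:

```
MODELS=`[
  {
    "name": "Ollama | Llama3.1",
    "tokenizer": "meta-llama/Llama-3.1-70B-Instruct",
    "parameters": { ... },
    "endpoints": [{ "type": "ollama", "url": "http://ollama:11434", "ollamaName": "llama3.1:latest" }]
  }
]`
```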

If that's not possible, you can fix your chatPromptTemplate like so:

    "chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}<|start_header_id|>assistant<|end_header_id|>\n\n",

I didn't try it, but that should work, let me know!

@avirgos
Author

avirgos commented Sep 30, 2024

> You're missing the initial generation prompt from your chatPromptTemplate!
>
> I'd recommend setting tokenizer: 'meta-llama/Llama-3.1-70B-Instruct' in your model config and removing chatPromptTemplate.
>
> If that's not possible, you can fix your chatPromptTemplate like so:
>
>     "chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}<|start_header_id|>assistant<|end_header_id|>\n\n",
>
> I didn't try it, but that should work, let me know!

Hello @nsarrazin,

I tried the tokenizer solution.

Here is the corresponding .env.local file:

MONGODB_URL=mongodb://mongodb:27017
HF_TOKEN=<hf-token>

PUBLIC_APP_NAME=<name>

MODELS=`[
  {
    "name": "Ollama | Llama3.1",
    "tokenizer": {
      "tokenizerUrl": "https://huggingface.co/nsarrazin/llama3.1-tokenizer/resolve/main/tokenizer.json",
      "tokenizerConfigUrl": "https://huggingface.co/nsarrazin/llama3.1-tokenizer/raw/main/tokenizer_config.json"
    },
    "parameters": {
      "temperature": 0.1,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 3072,
      "max_new_tokens": 1024,
      "stop": ["<|end_of_text|>", "<|eot_id|>"]
    },
    "endpoints": [
      {
        "type": "ollama",
        "url" : "http://ollama:11434",
        "ollamaName" : "llama3.1:latest"
      }
    ]
  }
]`

But this solution didn't work; here is the corresponding log:

{"level":50,"time":1727685091785,"pid":30,"hostname":"2209111a4d83","err":{"type":"TypeError","message":"fetch failed: Connect Timeout Error (attempted addresses: 18.155.129.4:443, 18.155.129.31:443, 18.155.129.129:443, 18.155.129.60:443)","stack":"TypeError: fetch failed\n at node:internal/deps/undici/undici:13178:13\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\n at runNextTicks (node:internal/process/task_queues:64:3)\n at process.processImmediate (node:internal/timers:454:9)\n at async getModelFile (file:///app/node_modules/@huggingface/transformers/dist/transformers.mjs:26549:24)\n at async getModelJSON (file:///app/node_modules/@huggingface/transformers/dist/transformers.mjs:26653:18)\n at async Promise.all (index 0)\n at async loadTokenizer (file:///app/node_modules/@huggingface/transformers/dist/transformers.mjs:20190:18)\n at async AutoTokenizer.from_pretrained (file:///app/node_modules/@huggingface/transformers/dist/transformers.mjs:24498:50)\n at async getTokenizer (file:///app/build/server/chunks/getTokenizer-cTh4LwyT.js:5:12)\ncaused by: ConnectTimeoutError: Connect Timeout Error (attempted addresses: 18.155.129.4:443, 18.155.129.31:443, 18.155.129.129:443, 18.155.129.60:443)\n at onConnectTimeout (node:internal/deps/undici/undici:2331:28)\n at node:internal/deps/undici/undici:2283:50\n at Immediate._onImmediate (node:internal/deps/undici/undici:2315:13)\n at process.processImmediate (node:internal/timers:483:21)"},"msg":"Failed to load tokenizer for model Ollama | Llama3.1 consider setting chatPromptTemplate manually or making sure the model is available on the hub."}

So I tried the second solution, which was to modify the chatPromptTemplate variable.

Here is the corresponding .env.local file:

MONGODB_URL=mongodb://mongodb:27017
HF_TOKEN=<hf-token>

PUBLIC_APP_NAME=<name>

MODELS=`[
  {
    "name": "Ollama | Llama3.1",
    "chatPromptTemplate": "<|begin_of_text|>{{#if @root.preprompt}}<|start_header_id|>system<|end_header_id|>\n\n{{@root.preprompt}}<|eot_id|>{{/if}}{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>\n\n{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}<|start_header_id|>assistant<|end_header_id|>\n\n",
    "parameters": {
      "temperature": 0.1,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 3072,
      "max_new_tokens": 1024,
      "stop": ["<|end_of_text|>", "<|eot_id|>"]
    },
    "endpoints": [
      {
        "type": "ollama",
        "url" : "http://ollama:11434",
        "ollamaName" : "llama3.1:latest"
      }
    ]
  }
]`

And this second solution works!

Thank you for your help.
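
In case it helps anyone else hitting this: the rendered prompt can also be sanity-checked against Ollama directly, since /api/generate accepts a pre-templated prompt when raw is set. A rough sketch (separate from chat-ui's own code path; the hard-coded prompt is just an example):

```ts
// Send an already-templated Llama 3.1 prompt straight to Ollama,
// bypassing Ollama's own prompt template with `raw: true`.
const prompt =
  "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n" +
  "Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n";

const res = await fetch("http://ollama:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.1:latest",
    prompt,
    raw: true,
    stream: false,
    options: { temperature: 0.1, stop: ["<|end_of_text|>", "<|eot_id|>"] },
  }),
});

// With the trailing assistant header in place, the reply starts with the answer
// itself rather than another <|start_header_id|>assistant<|end_header_id|>.
const { response } = await res.json();
console.log(response);
```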

@nsarrazin
Collaborator

Glad it worked! I'll look into why the first solution didn't work; it should have 👀
