scripts: fix pattern and get n_tokens in one go #10221

Merged 1 commit into ggml-org:master on Nov 9, 2024

Conversation

@lhpqaq (Contributor) commented Nov 8, 2024

Implement a TODO and fix #10219.
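
The diff itself isn't quoted in this conversation; as a rough sketch only, the kind of change the title describes replaces two separate passes over the llama-cli log with a single extraction, so the token count is captured "in one go" (all variable and pattern names below are hypothetical, not the actual PR code):

    # Hypothetical sketch only; not the actual diff from this PR.
    # Before: two separate greps over the log tail; each can fail on its
    # own when its pattern no longer matches current llama-cli output.
    session_size_msg=$(tail -n30 "$LOG" | grep -oE "$SESSION_SIZE_PATTERN")
    sample_time_msg=$(tail -n30 "$LOG" | grep -oE "$SAMPLE_TIME_PATTERN")

    # After: a single grep with an updated pattern that captures the
    # token counts in one pass, with an explicit error if parsing fails.
    if ! n_tokens_msg=$(tail -n30 "$LOG" | grep -oE "$N_TOKENS_PATTERN"); then
        echo >&2 "Couldn't get number of tokens from ./llama-cli output!"
        exit 1
    fi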

@lhpqaq (Contributor, Author) commented Nov 8, 2024

@slaren @ggerganov PTAL~

@reinvantveer commented Nov 8, 2024

Can confirm that this resolves the error message and exit 1 status reported in #10219.

@reinvantveer commented

Thank you, @lhpqaq!

@lhpqaq (Contributor, Author) commented Nov 8, 2024

@reinvantveer Thanks for your review.

@reinvantveer commented

I'm not sure, but I suspect there's some kind of regression: the script appears to read input where I haven't given any.

@reinvantveer commented

A quote from the "interactive storytelling" I'm trying out with the model:

As you're planning, you hear a noise coming from outside. It sounds like someone is approaching.
Do you:
A) Investigate the noise
B) Ignore it and continue planning
C) Abort the plan
D) Prepare to defend yourself
User:




main: saving final output to session file './chat/rein/current-cache.bin'



User:




main: saving final
main: saving final output to session file './chat/rein/current-cache-bin'



User:




main: saving final output to session file './chat/rein/current-cache-bin'



User:




main: saving final output to session file './chat/rein/current-cache-bin'



User:

main: saving final output to session file './chat/rein/current-cache.bin'



User: A
ChatLLaMa:  You decide to investigate the noise. You and Samantha carefully make your way to the door and listen intently. The noise sounds like footsteps, but they're light and cautious. It's clear that whoever it is, they're trying not to be seen.

@reinvantveer commented

> A quote from the "interactive storytelling"

I'm unsure where these extra `main: saving final output to session file` messages originate, whether it's something introduced by the changes or a different bug.

@lhpqaq (Contributor, Author) commented Nov 8, 2024

> I'm unsure where these extra `main: saving final output to session file` messages originate, whether it's something introduced by the changes or a different bug.

This comes from llama-cli (main.cpp); this PR only modifies the regular-expression pattern matching in the script.

@reinvantveer commented

> this PR only modifies the regular-expression pattern matching in the script

I see. I switched back to the master branch, and my chat is now stuck in an infinite loop of:

User:


main: saving final output to session file './chat/rein2/current-cache.bin'



User:


main: saving final output to session file './chat/rein2/current-cache.bin'



User:


main: saving final output to session file './chat/rein2/current-cache.bin'



User:

main: saving final output to session file './chat/rein2/current-cache.bin'

... [etc]

@lhpqaq (Contributor, Author) commented Nov 8, 2024

After each conversation, the script runs inference once in the background, but it only redirects standard error to the log, while the `saving final output to session file` message is written to standard output:

    # Update cache for next prompt in background, ideally during user input
    ./llama-cli >>"$LOG_BG" 2>&1 "${OPTS[@]}" \
          --prompt-cache "$NEXT_PROMPT_CACHE" \
          --file "$NEXT_PROMPT_FILE" \
          --n_predict 1 &

@reinvantveer There are indeed bugs here.
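
For reference, the difference between the two redirection forms matters here (standard shell semantics, not code quoted from this PR; `...` stands in for the remaining options):

    # Only stderr is appended to the log; anything printed on stdout
    # still reaches the terminal and interleaves with the chat.
    ./llama-cli 2>>"$LOG_BG" ... &

    # Both streams go to the log: stdout is appended to $LOG_BG first,
    # then stderr is redirected to wherever stdout now points.
    ./llama-cli >>"$LOG_BG" 2>&1 ... &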

@ggerganov merged commit 8fc393f into ggml-org:master on Nov 9, 2024. 1 check passed.
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

Successfully merging this pull request may close these issues:

Bug: Couldn't get number of tokens from ./llama-cli output! (#10219)