
Remove printing of prompt and prompt tokenization at startup #480


Closed
slaren wants to merge 1 commit from the remove-tokenization branch

Conversation

slaren (Member) commented Mar 24, 2023

Now that the tokenizer has been tested fairly well, printing the tokenization on startup adds a lot of clutter for no good reason.

This also removes printing of the prompt itself, since it is already printed as it is evaluated anyway.

ggerganov (Member) commented:
The lines have to at least be commented out so we can easily re-enable them; sometimes it is still useful to look at the tokens.
I think it is best to make this on/off via a command line arg.

anzz1 (Contributor) commented Mar 25, 2023

I do not agree; a '--quiet' option would be better instead. It is very useful to know which tokens are generated when trying out different prompts, and the tokens are vital when researching the whole thing.

Actually, it would be a cool addition to have a command line option that shows the output tokenized along with the token IDs, to get a better understanding of different models and their differences. So another option is to go the other way around: instead of '--quiet', a '-v' / '--verbose' option that shows this info.
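
For illustration, here is a minimal sketch of what such a verbose token dump might look like, written against the llama.cpp C API as it existed around this PR (llama_tokenize and llama_token_to_str); the verbose flag itself is hypothetical, not something this PR adds:

#include <cstdio>
#include <string>
#include <vector>
#include "llama.h"

// Sketch only: tokenize a prompt and, when `verbose` is set, dump each
// token id next to its text, much like the startup output discussed here.
static void dump_prompt_tokens(llama_context * ctx, const std::string & prompt, bool verbose) {
    // The token count cannot exceed the prompt length plus one for the BOS token.
    std::vector<llama_token> tokens(prompt.size() + 1);
    const int n = llama_tokenize(ctx, prompt.c_str(), tokens.data(), (int) tokens.size(), /*add_bos=*/ true);
    if (n < 0 || !verbose) {
        return; // tokenization failed, or the user did not ask for the dump
    }
    fprintf(stderr, "number of tokens in prompt = %d\n", n);
    for (int i = 0; i < n; i++) {
        fprintf(stderr, "%6d -> '%s'\n", tokens[i], llama_token_to_str(ctx, tokens[i]));
    }
}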

Also, since the "debug" and "standard" outputs are already directed to different streams, did you know that you can easily show only the generated output by redirecting stderr to a file or to the null device?

To a file:
main -m ./models/llama-13B-ggml/ggml-model-q4_0.bin 2> err.log

Silent:
main -m ./models/llama-13B-ggml/ggml-model-q4_0.bin 2>nul        (Windows)
main -m ./models/llama-13B-ggml/ggml-model-q4_0.bin 2>/dev/null  (Unix)

In any case, I agree that it definitely shouldn't be outright removed but rather made a command line option. That said, it's already easy to redirect stderr elsewhere if you want a 'clean look' when not testing or researching.

ggerganov (Member) commented:

Added a command line option for now: 502a400
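
For reference, the gating presumably looks something along these lines; this is a sketch only, since the exact flag name and plumbing in 502a400 are assumptions here (later llama.cpp versions expose this kind of switch as --verbose-prompt):

#include <cstring>

// Sketch: keep the token dump, but only emit it when the user asks for it.
// The flag name mirrors the --verbose-prompt option found in later llama.cpp;
// whether commit 502a400 used exactly this shape is an assumption.
struct example_params {
    bool verbose_prompt = false; // default off: no tokenization clutter at startup
};

static bool parse_arg(example_params & params, const char * arg) {
    if (std::strcmp(arg, "--verbose-prompt") == 0) {
        params.verbose_prompt = true;
        return true;
    }
    return false; // not ours; let the caller handle it
}

// Then, after the prompt has been tokenized at startup:
//     if (params.verbose_prompt) {
//         // print the prompt and the id -> token dump to stderr
//     }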

ggerganov closed this Mar 25, 2023
slaren deleted the remove-tokenization branch Mar 26, 2023
AAbushady pushed a commit to AAbushady/llama.cpp that referenced this pull request on Jan 27, 2024:
* Update gpttype_adapter.cpp

* use n_vocab instead of 32000 for when top k is off