Skip to content

User should be able to return control without inserting a newline #587

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
4 tasks done
ScarletEmerald opened this issue Mar 29, 2023 · 8 comments
Closed
4 tasks done
Labels
enhancement New feature or request stale

Comments

@ScarletEmerald
Copy link

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Current Behavior

In Interactive mode, if the user presses return, a newline is entered into the input. This makes it impossible to return control without inserting a newline.

A special-case exception was recently added for when the user enters only a newline and no other text. #529 #571 This returns control without inserting a newline. However, this only works for the special case of zero length input.

Expected Behavior

The newline entered by the user to return control should never be included in the text. If the user wants the text to end in a newline, this should be accomplished by explicitly adding a newline by using \ followed by return, then returning control by pressing return again.

This is what would be desired in Interactive mode. In Instruct mode, perhaps the desired behavior would the current behavior of including the final newlines.

@anzz1
Copy link
Contributor

anzz1 commented Mar 29, 2023

Actually, the length check isn't checking for zero-length input:
https://github.com/ggerganov/llama.cpp/blob/5a5f8b1501fbb34367225544010ddfc306d6d2fe/examples/main/main.cpp#L401
> 1 means "greater than" one. So 2 or more characters.

The std::getline() function only returns the control from the user back to the program after you have already input the newline character. So the program sees and processes the newline character after you have already seen it. We do ignore this, so while you see it, it isn't actually added to the input stream evaluated by the model.

I did think about this and indeed had exactly the same thought of the current solution not being optimal, especially in the case of instruct mode.

However, the problem is that is how terminals are supposed to work and there isn't really a good portable solution to this without going into platform specifics and even requiring extra libraries which seems to be more trouble than worth. As you would need to either read the keyboard input directly without using the getline func or rollback the terminal after the fact.

For Windows I know how to proceed, the low-level keyboard input can be read using GetAsyncKeyState or the console equivalent _kbhit. That would negate the need for rolling back the terminal. Rolling back could also be done simply by using FillConsoleOutputCharacter and SetConsoleCursorPosition.

However, those are for Windows only and I don't know how to implement those in Linux but a quick search shows me that it can be problematic and involves tricks or libraries Using kbhit() and getch() on Linux

This code base doesn't implement any non-portable functionality (which is a good thing) so I haven't implemented this for now for Windows.

I don't think a good, portable and easily maintainable solution exist for this, so you're kinda stuck with the newlines so far. But if someone else can think of one, that would be great.

@thement
Copy link
Contributor

thement commented Mar 29, 2023

The std::getline() function only returns the control from the user back to the program after you have already input the newline character.

I don't think this is a buffering issue. Users probably don't want the characters to be sent to llama tokenizer as they type them in.

@ScarletEmerald
So if I understand you correctly, you would like this keyboard input:
word<enter> to input "word" into the LLaMa tokenizer? (where <enter> is actually hitting enter and the input is C-style string)
and word\<enter><enter> should input "word\n" into LLaMa tokenizer?

Some programming languages (C, Rust) actually use the syntax

something\
 continuation

to ignore the newline (after the \ character) instead of inserting it.

How about word\<enter> meaning "word" input and word<enter> meaning "word\n" input? That has several advantages:

  1. regular and instruct mode don't have to have different <enter> handling
  2. it's mostly what user expects (when he hits enter, he sees a newline)
  3. the behavior of llama.cpp input is consistent with older versions

Other option would be to not insert the newline by default, but user would have to type out word\n<enter> input to get "word\n" to tokenizer.

@ScarletEmerald
Copy link
Author

@anzz1 Regarding implementation, I think any of these proposals should be relatively easy to implement in a portable way by modifying the loop at
https://github.com/ggerganov/llama.cpp/blob/5a5f8b1501fbb34367225544010ddfc306d6d2fe/examples/main/main.cpp#L383

I would not expect there would be any need to work with low-level keyboard input; just modifying line, another_line, and buffer should be enough.

@thement Regarding the desired behavior, the user may want to

  1. Insert a newline without returning control
  2. Return control without inserting a newline
  3. Insert a newline and then return control (1 followed by 2)

The current behavior is that word\<enter> does 1 and word<enter> does 3. We are missing 2.

My original proposal was to change word<enter> to instead do 2, at least in Interactive mode.

But, we could also do the other way around, where word\<enter> does 2 and word<enter> does 1. In other words, the user types as normal, inserting newlines as desired with <enter>, and then returns control when ready with \<enter> (which does not insert an additional newline).

Those are the only two options I see though which will give the user full control. Any scheme with an option to insert a newline and return control at the same time (case 3) won't be able to cover case 1 and case 2 without having two other input commands.

But perhaps a third input command is ok?
Perhaps word\<enter> would still be case 1, word<enter> would still be case 3, but we add say word/<enter> for case 2.

@gjmulder gjmulder added the enhancement New feature or request label Mar 30, 2023
@ScarletEmerald
Copy link
Author

After thinking about this some more, I think the desired behavior should be that if the user sends EOF (ctrl-Z on Windows or ctrl-D ctrl-D on Linux) then control should be returned without adding a newline. Everything else should remain as-is.

However, all my attempts to implement this have somehow broken the feature of using ctrl-C to interrupt generation. Either it no longer interrupts, or it causes the program to exit.

@jart
Copy link
Contributor

jart commented Apr 2, 2023

The way canonical mode works is the input line you're typing is buffered by the kernel. If ^D is pressed then standard input is closed and the program reacts by closing itself. If ^D is pressed after you've typed something on the line, then the driver delivers to the program without the trailing newline character. That means if I type hello^D then "hello" will be returned by read(), rather than it returning "hello\n" as it would normally, if I'd pressed ENTER instead.

To support this design, you'd have to get rid of std::getline(std::cin, line) in main.cpp and do the read calls manually. In order for this design to be simple and portable, you'd need to have a predetermined maximum line length, which should probably be PIPE_BUF bytes. In order to do better than that, you'd have to put the terminal in raw mode and believe me you don't want to implement that, because we'd be reluctant to accept a dependency just for this, and having it work on windows too is madness.

@thement
Copy link
Contributor

thement commented Apr 2, 2023

The current behavior is that word\<enter> does 1 and word<enter> does 3. We are missing 2.

Aha! I didn't know it behaves like that already. Introducing different behavior on word\<enter> would probably be confusing.

The way canonical mode works is the input line you're typing is buffered by the kernel. If ^D is pressed then standard input is closed and the program reacts by closing itself. If ^D is pressed after you've typed something on the line, then the driver delivers to the program without the trailing newline character.

Oh, I didn't know about that either, thanks! I always wondered why ^D wouldn't close the stdin if there were already some characters on the line.

This would be easy to implement on Linux (just one read call), but probably a nightmare on windows...

@github-actions github-actions bot added the stale label Mar 25, 2024
Copy link
Contributor

This issue was closed because it has been inactive for 14 days since being marked as stale.

@ScarletEmerald
Copy link
Author

This issue was resolved by #1040

danielzgtg added a commit to danielzgtg/llama.cpp that referenced this issue Apr 16, 2025
This restores the behavior from ggml-org#491. This does not affect Ctrl+D's ability to
terminate --multiline-input lines (ggml-org#1040).

This also actually implements ggml-org#587: "If the user wants the text to end in a
newline, this should be accomplished by explicitly adding a newline by using
\ followed by return, then returning control by pressing return again."

Fixes ggml-org#12949
ngxson pushed a commit that referenced this issue Apr 18, 2025
This restores the behavior from #491. This does not affect Ctrl+D's ability to
terminate --multiline-input lines (#1040).

This also actually implements #587: "If the user wants the text to end in a
newline, this should be accomplished by explicitly adding a newline by using
\ followed by return, then returning control by pressing return again."

Fixes #12949
colout pushed a commit to colout/llama.cpp that referenced this issue Apr 21, 2025
This restores the behavior from ggml-org#491. This does not affect Ctrl+D's ability to
terminate --multiline-input lines (ggml-org#1040).

This also actually implements ggml-org#587: "If the user wants the text to end in a
newline, this should be accomplished by explicitly adding a newline by using
\ followed by return, then returning control by pressing return again."

Fixes ggml-org#12949
pockers21 pushed a commit to pockers21/llama.cpp that referenced this issue Apr 28, 2025
This restores the behavior from ggml-org#491. This does not affect Ctrl+D's ability to
terminate --multiline-input lines (ggml-org#1040).

This also actually implements ggml-org#587: "If the user wants the text to end in a
newline, this should be accomplished by explicitly adding a newline by using
\ followed by return, then returning control by pressing return again."

Fixes ggml-org#12949
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request stale
Projects
None yet
Development

No branches or pull requests

5 participants