-
Notifications
You must be signed in to change notification settings - Fork 1
main: use seperate stream for control characters #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
main: use seperate stream for control characters #4
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% sure about the overall approach yet, but the implementation looks basically sound aside from the comments.
Fixed @mrdomino ready for you to test |
if (!params.conversation && sparams.grammar.empty()) | ||
{ | ||
// Stream Control Token To Standard Output as long as we are not in a conversation or grammar output | ||
fprintf(stdout, "%s", token_str.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stdout is buffered, so if you are going to be using it, you will need to make sure fflush(stdout)
is called before every write to the token fd. Otherwise the tokens will not interleave correctly.
You could also switch to calling write on fd 1 and make the output fully unbuffered, but I think this would be suboptimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added fflush(stdout) before every fprintf() now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wouldn't you flush after each print?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jart my theory was that you'd want to take advantage of the buffering. If you don't care about buffering output then absolutely, or even setbuf(stdout, NULL)
at the start of the program.
ETA: the other thing is that if you have anything else doing anything to stdout it can mess with the interleaving... it just seemed simpler to me to constrain it to where you were doing the token write
6c3890f
to
5032f18
Compare
@mrdomino thanks for the command string you used to test. This took a while, but I realised now that the C file open and bash are using the same file descriptor... so we really actually needed to check right at the start of the program to fully check if the file descriptor was already claimed by bash. Anyway this is the full test process to prove that this is behaving as I was expecting, you may want to probe it a bit further, but I'm a bit more confident now that I know what I'm doing now here. cmake -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build
echo "== Expect Control Token To Shared Console 3>&1 =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>&1
echo
echo "== Expect No Control Token To Console because 3>/dev/null =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>/dev/null
echo
echo "== Expect No Control Token To Console because 3>&- =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>&-
echo
echo == Expect No Control Token To Console as we are still in grammar mode ==
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null
echo
echo == Expect Control Token To Console as we are in normal completion mode ==
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Hi
<|assistant|>" 2>/dev/null 3>&1
echo Which would result in this output
|
if (!params.conversation && sparams.grammar.empty()) | ||
{ | ||
// Stream Control Token To Standard Output as long as we are not in a conversation or grammar output | ||
fprintf(stdout, "%s", token_str.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wouldn't you flush after each print?
86b9e96
to
8f76ba5
Compare
#ifndef _MSC_VER | ||
if (control_token_file_descriptor_is_attached) { | ||
// Stream Control Token To Special Token Output. Useful for debugging control token behaviour | ||
(void)! write(SPECIAL_FILENO, token_str.c_str(), token_str.length()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extraneous !
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also I'm not sure why you'd have (void)
before write
but not fprintf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intentional or it breaks make
command
(void) write(SPECIAL_FILENO, token_str.c_str(), token_str.length());
examples/main/main.cpp: In function ‘int main(int, char**)’:
examples/main/main.cpp:767:37: warning: ignoring return value of ‘ssize_t write(int, const void*, size_t)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
767 | (void) write(SPECIAL_FILENO, token_str.c_str(), token_str.length());
| ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, apparently gcc and glibc have collaborated to do an extremely dumb thing. fine, keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Context: https://bugs.llvm.org/show_bug.cgi?id=51228
Context: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425
A possible appproach according to @mrdomino
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-result"
write(...);
#pragma GCC diagnostic pop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
Stab at this idea for ggml-org#6923