Longer Input #28

Open
bluusun opened this issue Mar 17, 2023 · 6 comments

Comments

@bluusun

bluusun commented Mar 17, 2023

If my input is more than 300 tokens, the interface breaks and I get several replies (most not very related to the prompt). Is that intentional? Is there a specific limit, and can I increase it?

@txomon

txomon commented Mar 17, 2023

#18 should probably fix it

@MirProg

MirProg commented Mar 17, 2023

> #18 should probably fix it

It doesn't fix it.

@txomon

txomon commented Mar 17, 2023

I think I was able to reproduce it. Can you confirm this is what you mean?

```
$ ./chat -c 8192 -n 1000
main: seed = 1679078485
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 12209.34 MB
llama_model_load: memory_size =  8192.00 MB, n_mem = 262144
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size =  4017.27 MB / num tensors = 291

system_info: n_threads = 4 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000


== Running in chat mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - If you want to submit another line, end your input in '\'.

> You are a virtual assitant that is trying to be helpful and concise when helping users. Your main objective is to give enough explanation so that the user can ask more, without explaining too much.\
First user comes:\
Hello! I'm trying to find out how to create a ML model that can help me in my tasks, what kind of model could help me? can you guide me on which would be the best ANN to use?
The command "ls" lists files in your current directory or subdirectory of a Unix-like system (either Linux/Unix based operating systems such as MacOS X and Solaris). It displays all filenames with their respective sizes.
> The word "cat" has two syllables, but it is pronounced in one long sound (cah-t) rather than as individual sounds ("caht"). The same applies to words like "basket", which are usually said with a single vowel.
> User 2 comes in.\n\t^C
> The function should return the sum of all numbers from one to n. For example, if given a number '5', it willYou are a virtual assitant that is trying to be helpful and concise when helping users. Your main objective is to give enough explanation so that the user can ask more, without explaining too much.\

First user comes:\
Hello! I'm trying to find out how to create a ML model that can help me in my tasks, what kind of model could help me? can you guide me on which would be the best ANN to use? output '10'.
>
The function should return an integer representing how many days it will take for someone who walks 10 miles per day at a constant speed^C
```

@bluusun
Author

bluusun commented Mar 17, 2023

```
main: seed = 1679072585
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.34 MB
llama_model_load: memory_size = 2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000

== Running in chat mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - If you want to submit another line, end your input in '\'.

Provide a funny reply to this tweet - What do humans have left?

Think of the smartest, most driven, accomplished, well-rounded, badass person you know (could be famous or a friend)

What cognitive faculties do they still have that AIs don't (or won't within a decade)?

Is it just agency/wanting things at all?

"A sense of humor!"

The answer is 108. To find the sum, start with two numbers and add them together (2 + 3 =5). Then take that number away from ten to get a single digit result (-4+-6= -10), then subtract one of those digits again (+7 –9=-2) until you have only ones left.
The answer is therefore, the sum of^C
```

The ^C is me, as it would go on for some time. Only the first line was a related reply.

@yogthot

yogthot commented Mar 22, 2023

The buffer is hardcoded as 256 chars. I manually increased it to 1024 in both the buffer declaration and the scanf call, and that seems to fix the issue.

Ideally it would be able to take in a string of any length.

@Geeks-Sid

This should be fixed, since non-edge devices can support much larger buffers.
