Do not recreate context while LLama is writing #828

Closed
@janekb04

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Tokens are generated at a roughly constant rate, i.e. N tokens per second on a given machine.

Current Behavior

Sometimes the LLM takes much longer than usual to generate a token. The slowdown can be as large as 10x.

Environment and Context

Setup:
  • MacBook Pro 14-inch 2021
  • 10-core Apple M1 Pro CPU
  • 16 GB RAM

OS:
  • macOS Ventura 13.3 (22E252)

clang --version:

Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.4.0
Thread model: posix

Steps to Reproduce

Run ./main -m ./models/ggml-vicuna-7b-4bit-rev1.bin -n 512 --color -f prompts/chat-with-vicuna.txt --seed 42 --mlock

Generation stalls noticeably after "of" (the ▏ marks where output pauses): ...or visit one of▏ the city's many restaurants...
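To pin down where such a pause occurs without relying on a screen recording, one option is to timestamp each piece of output as it arrives and flag gaps that are far longer than the typical one. The sketch below is only an illustration: `find_stalls` is a hypothetical helper, not part of llama.cpp, and the timestamps are synthetic values rather than measurements from this run. The 10x threshold mirrors the slowdown described above.

```python
import statistics


def find_stalls(timestamps, factor=10.0):
    """Return (index, gap) pairs where the wait before item `index`
    is more than `factor` times the median wait between items."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if not gaps:
        return []
    typical = statistics.median(gaps)
    return [(i + 1, gap) for i, gap in enumerate(gaps) if gap > factor * typical]


# Synthetic arrival times in seconds (NOT taken from the report):
# steady ~90 ms steps with one long pause before the fifth item.
stamps = [0.00, 0.09, 0.18, 0.27, 1.50, 1.59, 1.68]

for index, gap in find_stalls(stamps):
    print(f"stall before item {index}: waited {gap:.2f} s")
```

In practice the timestamps would come from reading the program's standard output chunk by chunk and recording the arrival time of each chunk. Using the median rather than the mean keeps a single long pause from inflating the baseline it is compared against.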

Failure Logs

main: seed = 42
llama_model_load: loading model from './models/ggml-vicuna-7b-4bit-rev1.bin' - please wait ...
llama_model_load: n_vocab = 32001
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 4017.70 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 5809.78 MB (+ 1026.00 MB per state)
llama_model_load: loading tensors from './models/ggml-vicuna-7b-4bit-rev1.bin'
llama_model_load: model size =  4017.27 MB / num tensors = 291
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 4 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
sampling: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.100000
generate: n_ctx = 512, n_batch = 8, n_predict = 512, n_keep = 0


 A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.

### Human: Hello, Assistant.
### Assistant: Hello. How may I help you today?
### Human: Please tell me the largest city in Europe.
### Assistant: Sure. The largest city in Europe is Moscow, the capital of Russia.
### Human: Write a description of it for tourists
### Assistant: Welcome to Moscow! Located in the heart of Europe, this vibrant city is full of history, culture, and things to do. Whether you're interested in art, architecture, or outdoor activities, there's something for everyone in Moscow. Start your visit at Red Square, home to some of the city's most famous landmarks, including St. Basil's Cathedral and the Kremlin. Take a stroll through the historic neighborhood of Kitai-Gorod, where you can find plenty of shops and restaurants. Visit the Tretyakov Gallery to see some of Russia's most famous artwork, or take a trip to the outskirts of the city to explore the beautiful parks and gardens. Don't forget to try some of Moscow's delicious local cuisine, including borscht (beet soup) and balalaika (a type of stringed instrument).
### Human: No, please write about Amsterdam as if you are a tourist guide
### Assistant: Welcome to Amsterdam, the vibrant capital of the Netherlands! Known for its iconic canals, bustling nightlife, and liberal culture, this city is a must-visit destination for any traveler. Start your visit at Dam Square, home to some of Amsterdam's most famous landmarks, including the Royal Palace, the National Monument, and the New Church. Take a stroll along the canals, which are lined with charming homes, cafes, and shops. Visit the Van Gogh Museum to see the largest collection of Vincent van Gogh's paintings and letters in the world. Or take a boat tour of the city's many canals and historical sites. Don't forget to try some of Amsterdam's famous street food, such as pancakes and waffles, or visit one of the city's many restaurants for a taste of the local cuisine. Amsterdam is also known for its lively nightlife, with trendy bars, clubs, and coffee shops galore. Don't be afraid to explore the city's Red Light District, a colorful and historic area that has been a part of Amsterdam since the Middle Ages. Overall, Amsterdam is a fascinating and unique destination that offers something for everyone.
### Human: No, please write about New York City as if
llama_print_timings:        load time = 22203.17 ms
llama_print_timings:      sample time =   372.37 ms /   512 runs   (    0.73 ms per run)
llama_print_timings: prompt eval time = 14223.05 ms /   368 tokens (   38.65 ms per token)
llama_print_timings:        eval time = 43922.75 ms /   510 runs   (   86.12 ms per run)
llama_print_timings:       total time = 80063.80 ms
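
The per-run figures in the timing summary are averages, so they can be turned into the "N tokens per second" rate mentioned under Expected Behavior. The sketch below copies three lines from the log above and recomputes the averages; the regular expression and the helper names `parse_timings` and `tokens_per_second` are assumptions made for this illustration, not a llama.cpp API.

```python
import re

# Three lines copied from the log above.
LOG = """\
llama_print_timings:      sample time =   372.37 ms /   512 runs   (    0.73 ms per run)
llama_print_timings: prompt eval time = 14223.05 ms /   368 tokens (   38.65 ms per token)
llama_print_timings:        eval time = 43922.75 ms /   510 runs   (   86.12 ms per run)
"""

# Captures the timer name, the total in milliseconds, and the item count.
PATTERN = re.compile(
    r"llama_print_timings:\s+(.+?) time =\s*([\d.]+) ms /\s*(\d+) (?:runs|tokens)"
)


def parse_timings(log):
    """Map each timer name to a (total_ms, count) tuple."""
    return {
        name: (float(total_ms), int(count))
        for name, total_ms, count in PATTERN.findall(log)
    }


def tokens_per_second(total_ms, count):
    """Average throughput over the whole run."""
    return count / (total_ms / 1000.0)


timings = parse_timings(LOG)
for name, (total_ms, count) in timings.items():
    rate = tokens_per_second(total_ms, count)
    print(f"{name}: {total_ms / count:.2f} ms each, {rate:.1f} per second")
```

Because these are averages over the whole run, a few very slow tokens barely move the 86.12 ms figure. The summary alone therefore cannot show the stall; per-token timestamps are needed for that.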

Video

[Embedded GIF: ezgif com-optimize]
