Closed as not planned
Labels
bug: Related to a bug, vulnerability, unexpected error with an existing feature
Description
System Info
LangChain 0.0.242
Who can help?
No response
Information
- The official example notebooks/scripts
- My own modified scripts
Related Components
- LLMs/Chat Models
- Embedding Models
- Prompts / Prompt Templates / Prompt Selectors
- Output Parsers
- Document Loaders
- Vector Stores / Retrievers
- Memory
- Agents / Agent Executors
- Tools / Toolkits
- Chains
- Callbacks/Tracing
- Async
Reproduction
Tried to load the https://huggingface.co/TheBloke/StableBeluga2-70B-GGML model with LangChain's LlamaCpp:
llm = LlamaCpp(model_path="./stablebeluga2-70b.ggmlv3.q4_0.bin", n_gpu_layers=n_gpu_layers, n_batch=n_batch, n_ctx=8192, input={"temperature": 0.01}, n_threads=8)
llm_chain = LLMChain(llm=llm, prompt=prompt)
I see there is no support for passing the n_gqa=8 parameter, which according to https://github.com/abetlen/llama-cpp-python should be used for 70B models.
The error I get is:
error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024
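A possible workaround sketch, not a confirmed fix: if the installed LangChain version exposes a model_kwargs field on LlamaCpp, extra llama.cpp options such as n_gqa can be forwarded through it to llama_cpp.Llama. The helper name and the n_gpu_layers/n_batch values below are illustrative assumptions, and n_gqa itself assumes a llama-cpp-python build that accepts it:

```python
# Sketch: forward n_gqa to llama-cpp-python via LlamaCpp's model_kwargs
# field. Assumes the installed LangChain version has that field and the
# installed llama-cpp-python accepts n_gqa (needed for LLaMA-2 70B GGML).

def build_llamacpp_kwargs(model_path: str, n_gpu_layers: int, n_batch: int) -> dict:
    """Collect constructor arguments for langchain.llms.LlamaCpp (hypothetical helper)."""
    return {
        "model_path": model_path,
        "n_gpu_layers": n_gpu_layers,
        "n_batch": n_batch,
        "n_ctx": 8192,
        "n_threads": 8,
        # Keys in model_kwargs are passed straight through to llama_cpp.Llama.
        "model_kwargs": {"n_gqa": 8},
    }

kwargs = build_llamacpp_kwargs(
    "./stablebeluga2-70b.ggmlv3.q4_0.bin",
    n_gpu_layers=40,  # example value
    n_batch=512,      # example value
)
# llm = LlamaCpp(**kwargs)  # not executed here; requires the model file
```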
Expected behavior
The model should load successfully.