AssertionError when trying to use tensor parallelism #747
Replies: 1 comment
Actually, I found the solution: I had overlooked the tensor-parallel example in the exllamav2 directory. It turns out I had the wrong cache type all along, when I should have used "ExLlamaV2Cache_TP".
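For anyone hitting the same assertion, here is a minimal sketch of what the fix looks like, modeled on the tensor-parallel example that ships with exllamav2. The function name `build_tp_generator`, the model path, and the 16384-token cache length are placeholders of mine, not values from the original post, so adjust them to your setup and check the calls against the example in your installed version.

```python
def build_tp_generator(model_dir: str, max_seq_len: int = 16384):
    """Load an EXL2 model in tensor-parallel mode and return a generator.

    Hypothetical helper for illustration; model_dir and max_seq_len are
    placeholder assumptions, not values from the original post.
    """
    # Imported inside the function so the file can be imported
    # even on a machine where exllamav2 is not installed.
    from exllamav2 import (
        ExLlamaV2,
        ExLlamaV2Cache_TP,
        ExLlamaV2Config,
        ExLlamaV2Tokenizer,
    )
    from exllamav2.generator import ExLlamaV2DynamicGenerator

    config = ExLlamaV2Config(model_dir)
    model = ExLlamaV2(config)

    # Load with tensor parallelism. With no gpu_split given, the model is
    # split across all visible GPUs; expect_cache_tokens helps balance it.
    model.load_tp(progress=True, expect_cache_tokens=max_seq_len)

    # The key point of the fix: a model loaded with load_tp() needs
    # ExLlamaV2Cache_TP, allocated after the model has been loaded.
    cache = ExLlamaV2Cache_TP(model, max_seq_len=max_seq_len)

    tokenizer = ExLlamaV2Tokenizer(config)
    return ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)


# Example usage (needs GPUs and a real EXL2 model directory):
# generator = build_tp_generator("/path/to/your/exl2-model")
# print(generator.generate(prompt="Hello, my name is", max_new_tokens=50))
```

The only line that differs from a regular single-GPU setup, apart from calling `load_tp()`, is the cache constructor; everything after that should work the same way as without tensor parallelism.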
Hello, I'm having an issue where I can't get tensor parallelism to work on my PC. My system is running Ubuntu 22.04 with two RTX 3090s and CUDA 12.04. I have tried several models but have had no luck so far. The one in this example is Qwen2.5 8.0 bpw. This is the code I'm running:
The error message I get is:
I apologize beforehand if I'm missing something trivial. I tried to look for similar errors but couldn't find any. I'm not a coder, so I have a lot of trouble figuring out what I'm doing wrong.