forked from ggml-org/llama.cpp
Closed
Description
So I noticed it runs WAY slow, then realized my card is not set up for that: I am running ye olde P40, so no tensor cores. But this fellow over at flash attention apparently made it possible to work without them (ggml-org#7188)? I assume this is not implemented in this fork yet. Any chance it could be?