Separated context and state for easier parallelization. #494
std::vector<float> probs;
std::vector<float> logits;
std::vector<float> logprobs;
std::vector<float> probs{};
I fought with a read-access-denied exception for almost two hours, only to find out that it is a compiler bug (debug builds only): microsoft/STL#1934
The workaround is simply to initialize the struct members explicitly (otherwise, it's pretty hard to debug).
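For readers who have not seen the pattern, here is a minimal sketch of the workaround described above. The struct name `decoder_buffers` and the helper `make_buffers` are hypothetical and only loosely modelled on the fields in the diff; they are not the actual whisper.cpp definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical struct (an assumption, not the real whisper.cpp type).
struct decoder_buffers {
    std::vector<float> probs{};     // explicit value-initialization: an empty vector
    std::vector<float> logits{};
    std::vector<float> logprobs{};
};

// Hypothetical helper: returns buffers sized for n_vocab entries.
inline decoder_buffers make_buffers(std::size_t n_vocab) {
    assert(n_vocab > 0);            // precondition of this sketch only
    decoder_buffers buf{};          // brace-initialize the whole struct as well
    buf.probs.resize(n_vocab);
    buf.logits.resize(n_vocab);
    buf.logprobs.resize(n_vocab);
    return buf;
}
```

In standard C++, `std::vector<float> probs;` and `std::vector<float> probs{};` both call the default constructor, so the extra braces should not change behavior on a conforming toolchain.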
@ggerganov please take a look when you have some time. It will be pretty hard to maintain this branch and resolve conflicts if other changes land in the meantime.
@@ -10,20 +10,20 @@

constexpr int N_THREAD = 8;

// TODO: get rid of this vector of contexts - bad idea in the first place
Fixed all these TODOs: only one context is needed now, plus a vector of states.
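A rough sketch of the "one context, a vector of states" layout, in case it helps the review. `Context`, `State`, `run` and `run_parallel` are hypothetical stand-ins, not the whisper.cpp types or functions. The assumption is that the context is read-only after loading, while every run writes only to its own state.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-ins (assumptions), NOT the whisper.cpp types.
struct Context { std::vector<float> weights; };   // shared, read-only after loading
struct State   { float result = 0.0f; };          // per-run scratch and output

// Reads the shared context, writes only to the state it was given.
inline void run(const Context & ctx, State & st, float input) {
    st.result = input * std::accumulate(ctx.weights.begin(), ctx.weights.end(), 0.0f);
}

// One context, one state per input, one thread per state.
inline std::vector<State> run_parallel(const Context & ctx, const std::vector<float> & inputs) {
    assert(!ctx.weights.empty());                  // the context must be loaded first
    std::vector<State> states(inputs.size());      // sized up front, never resized below
    std::vector<std::thread> workers;
    workers.reserve(inputs.size());
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        workers.emplace_back(run, std::cref(ctx), std::ref(states[i]), inputs[i]);
    }
    for (auto & w : workers) {
        w.join();
    }
    return states;
}
```

Because no thread touches another thread's `State` and the `Context` is only read, the sketch needs no locking.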
Hello @RndyP, about that spinlock: I agree it would be nice to change it. Could you create another issue with your findings?
I reported this in issue #300, which has fallen down the list. I don't understand the code well enough to suggest a fix myself.
static LONG atomic_load(atomic_int* ptr) {
I believe InterlockedCompareExchange() is simply locking the data values, and the while() is simply spinning and waiting for the value to change.
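To make the discussion concrete, here is a portable sketch of the same idea using `std::atomic` rather than the Win32 call. It assumes (the function body is not shown in the comment above) that the load is implemented as a compare-exchange whose comparand and new value are identical. The names `load_via_cas` and `spin_until` are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: reads an atomic through compare-exchange.
// If the value is 0, the exchange "succeeds" and stores 0 again (no visible change).
// If the value is not 0, the exchange fails and writes the current value into `expected`.
// Either way `expected` ends up holding the current value, so this acts as an atomic load.
inline int load_via_cas(std::atomic<int> & v) {
    int expected = 0;
    v.compare_exchange_strong(expected, 0);
    return expected;
}

// Hypothetical busy-wait: polls until the flag reaches `target`.
// The yield is an addition of this sketch; a pure spin would omit it.
inline void spin_until(std::atomic<int> & flag, int target) {
    while (load_via_cas(flag) != target) {
        std::this_thread::yield();
    }
}
```

Under that assumption the compare-exchange is not taking a lock; it is just an atomic read, and the CPU time goes into the polling loop around it.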
@sandrohanea Don't worry about keeping up-to-date - I will be able to do that if necessary.
Thanks a lot for taking the time, it totally makes sense. Also, thanks again for creating the whole library in the first place; porting all the tensor operations and everything else was really titanic work. I was also thinking about keeping a "default state" in the context and using it when no other state is provided, so that existing functionality would not break. On the other hand, it is really easy to use it wrong if the context is thread-safe only "sometimes". It's your call on this one. If you have any idea how this could be done better, please let me know and I'll be happy to help.
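A minimal sketch of the "default state" idea, to illustrate the trade-off. All names (`Context`, `State`, `pick_state`, `run_full`) are hypothetical and do not reflect the merged API.

```cpp
#include <cassert>

// Hypothetical stand-ins (assumptions), NOT the whisper.cpp types.
struct State   { int n_segments = 0; };
struct Context { State default_state; };   // used when the caller passes no state

// Selects the caller's state if one was given, else the context's own.
inline State & pick_state(Context & ctx, State * st) {
    return st ? *st : ctx.default_state;
}

// Existing callers keep calling run_full(ctx); new callers can opt in to their own state.
inline int run_full(Context & ctx, State * st = nullptr) {
    State & s = pick_state(ctx, st);
    assert(st == nullptr || &s == st);      // an explicit state is never replaced by the default
    s.n_segments += 1;                      // placeholder for the real work
    return s.n_segments;
}
```

This keeps old call sites working, but two threads that both omit the state argument would share `default_state`, which is the "thread-safe sometimes" hazard mentioned above.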
Closing this, as the alternative with an opt-in state was merged.
#475
whisper_full_with_state instead of whisper_full, so they can read the results from the state at the end of the transformation.
Note: In order for the bindings to be able to use the same context with multiple different transformations (with a state for each transformation) in parallel, some adjustments are needed, but the default cases are already working.