Use BLAS acceleration in .dot() when possible #92

Merged 7 commits on Feb 28, 2016

@bluss bluss commented Feb 28, 2016

Use BLAS to compute the dot product when possible (for f32 and f64).

Our manual scalar product implementation is pretty good, but the BLAS implementation still beats it, finishing a 1024-element dot product twice as fast. So this is an improvement even in the simplest cases. (We still use the generic dot for vectors shorter than 32 elements, since that is faster for them.)
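The dispatch described above could look roughly like the following. This is a minimal sketch on plain slices, not the code in this PR: the names `DOT_BLAS_CUTOFF`, `generic_dot`, `blas_like_dot_f64` and `Path` are hypothetical, the `TypeId` check is just one way to pick out `f64` at runtime, and the "BLAS" function is a pure-Rust stand-in so the example stays self-contained. A real build would call something like `cblas_ddot` there, and would handle `f32` the same way.

```rust
use std::any::TypeId;
use std::ops::{Add, Mul};

/// Hypothetical cutoff: vectors shorter than this use the generic loop.
/// (The exact boundary condition is an assumption of this sketch.)
const DOT_BLAS_CUTOFF: usize = 32;

/// Records which code path was taken, only so the sketch can be checked.
#[derive(Debug, PartialEq)]
enum Path {
    Generic,
    Accelerated,
}

/// Generic scalar product for any copyable numeric element type.
fn generic_dot<A>(a: &[A], b: &[A]) -> A
where
    A: Copy + Default + Add<Output = A> + Mul<Output = A>,
{
    let mut sum = A::default();
    for (&x, &y) in a.iter().zip(b) {
        sum = sum + x * y;
    }
    sum
}

/// Stand-in for a BLAS call such as `cblas_ddot`; NOT an actual BLAS binding.
fn blas_like_dot_f64(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Dot product that takes the accelerated path for long `f64` vectors
/// and falls back to the generic loop otherwise.
fn dot<A>(a: &[A], b: &[A]) -> (A, Path)
where
    A: Copy + Default + Add<Output = A> + Mul<Output = A> + 'static,
{
    assert_eq!(a.len(), b.len(), "dot: length mismatch");
    if a.len() >= DOT_BLAS_CUTOFF && TypeId::of::<A>() == TypeId::of::<f64>() {
        // SAFETY: the TypeId check above guarantees that A is exactly f64,
        // so reinterpreting the slices and the result is a no-op cast.
        let result: A = unsafe {
            let af = std::slice::from_raw_parts(a.as_ptr() as *const f64, a.len());
            let bf = std::slice::from_raw_parts(b.as_ptr() as *const f64, b.len());
            let r = blas_like_dot_f64(af, bf);
            *(&r as *const f64 as *const A)
        };
        return (result, Path::Accelerated);
    }
    (generic_dot(a, b), Path::Generic)
}
```

With this sketch, `dot` on two 1024-element `f64` slices takes the accelerated path, while a 3-element `f64` slice or any `i32` slice falls back to `generic_dot`.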

  • Introduce an internal module imp_prelude (implementation's prelude) that simplifies importing the main types that we use everywhere.
  • Add type Priv that can be used to attach private methods that can still be used everywhere in the crate.
  • Deprecate ArrayBase::zeros (stray extra change).
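
To illustrate the first two items, here is one way an internal prelude and a `Priv` wrapper can fit together. This is a sketch under assumptions, not the layout used in this PR: `Vector`, `dot_impl` and `dot_generic` are hypothetical names, `Priv` is assumed to be a simple tuple wrapper, and everything is nested in a `sketch` module to keep the example self-contained.

```rust
mod sketch {
    /// Hypothetical stand-in for the crate's public array type.
    pub struct Vector {
        data: Vec<f64>,
    }

    /// Wrapper type that is never exported. Methods implemented on
    /// `Priv<...>` are callable anywhere inside the crate but are
    /// invisible to downstream users. (Assumed shape, for illustration.)
    pub(crate) struct Priv<T>(pub(crate) T);

    /// Internal prelude: one glob import pulls in the commonly used types.
    mod imp_prelude {
        pub(crate) use super::{Priv, Vector};
    }

    /// An implementation module that relies on the internal prelude.
    mod dot_impl {
        use super::imp_prelude::*;

        impl<'a> Priv<&'a Vector> {
            /// Private helper attached through the wrapper type.
            pub(crate) fn dot_generic(&self, rhs: &Vector) -> f64 {
                self.0.data.iter().zip(&rhs.data).map(|(x, y)| x * y).sum()
            }
        }
    }

    impl Vector {
        pub fn new(data: Vec<f64>) -> Self {
            Vector { data }
        }

        /// Public entry point that delegates to the private helper.
        pub fn dot(&self, rhs: &Vector) -> f64 {
            Priv(self).dot_generic(rhs)
        }
    }
}
```

The point of the wrapper is that `dot_generic` never shows up in the public API of `Vector`, yet any module in the crate can reach it by writing `Priv(&v)`.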

Simplify based on the new restriction (Copy).
For some measure of "long": the smallest vectors (32 elements or smaller) seem to
benefit from using the plain generic dot product.
bluss added a commit that referenced this pull request Feb 28, 2016
Use BLAS acceleration in .dot() when possible
@bluss bluss merged commit 9bae4eb into master Feb 28, 2016
@bluss bluss deleted the specialize-dot branch February 28, 2016 20:01