Add CSR merge_path SpMV for OpenMP backend #1810
LGTM, I have mostly comments left on the documentation.
omp/matrix/csr_kernels.cpp
Outdated
@@ -42,12 +42,133 @@ namespace omp {
namespace csr {


/**
 * Computes the begin offsets into A and B for the specific diagonal
Could you rephrase this? It's unclear what A and B are, as well as what the specific diagonal is.
omp/matrix/csr_kernels.cpp
Outdated
 * Computes the begin offsets into A and B for the specific diagonal
 *
 * @param diagonal the diagonal to search
 * @param end_row_offsets the ending of row offsets of A
Also unclear. Is this the pointer to the end of the row offsets for A?
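For readers following the thread, here is a minimal sketch of the kind of diagonal search the doc comment seems to describe, assuming the usual merge-path formulation: list A is the array of row end offsets (`row_ptrs + 1`) and list B is the implicit sequence of nonzero indices 0, 1, ..., nnz - 1. The name `merge_path_search` and its signature are hypothetical and are not taken from the PR.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical helper, not the PR's code. Finds the point where `diagonal`
// crosses the merge path of
//   A = row_end_offsets[0 .. num_rows)   (row_ptrs + 1 in CSR terms)
//   B = 0, 1, ..., nnz - 1               (implicit list of nonzero indices)
// Returns (row, nz) with row + nz == diagonal: the number of entries of A
// and of B that lie before the diagonal.
template <typename IndexType>
std::pair<IndexType, IndexType> merge_path_search(
    IndexType diagonal, const IndexType* row_end_offsets, IndexType num_rows,
    IndexType nnz)
{
    // Range of A-coordinates that the diagonal can reach.
    IndexType x_min = std::max<IndexType>(diagonal - nnz, 0);
    IndexType x_max = std::min(diagonal, num_rows);
    // Binary search along the diagonal.
    while (x_min < x_max) {
        const IndexType pivot = x_min + (x_max - x_min) / 2;
        // B[j] == j, so B[diagonal - pivot - 1] is just that index.
        if (row_end_offsets[pivot] <= diagonal - pivot - 1) {
            x_min = pivot + 1;  // A[pivot] is consumed before the diagonal
        } else {
            x_max = pivot;
        }
    }
    return {x_min, diagonal - x_min};
}
```

With `row_ptrs = {0, 2, 2, 5}` (three rows, five nonzeros), diagonal 4 maps to row 2 and nonzero 2: the first two nonzeros and the ends of rows 0 and 1 have been consumed at that point.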
omp/matrix/csr_kernels.cpp
Outdated
const auto nnz = static_cast<IndexType>(a->get_num_stored_elements());
const auto num_threads = static_cast<IndexType>(omp_get_max_threads());
// Merge list A: row end offsets
const IndexType* row_end_offsets = row_ptrs + 1;
Is the suffix _offsets necessary? Given how you define it, I would just call it row_ends or maybe row_end_idxs.
I think offsets matches the meaning better than idxs.
I now use row_end_ptrs to match the term we already use in CSR.
auto row_carry_out_ptr = row_carry_out.get_data();
auto value_carry_out_ptr = value_carry_out.get_data();

// TODO: parallelize with number of cols, too.
If you want to, I think it's pretty simple to move this loop to the innermost loop. You just have to increase the size of value_carry_out to num_threads * c->get_size()[1].
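A rough sketch of what this suggestion could look like, assuming row-major dense `b` and `c` and plain `std::vector` storage instead of Ginkgo's array and matrix types. The function name `merge_path_spmv_sketch`, its signature, and the explicit `num_threads` parameter (assumed to be at least 1) are all hypothetical. The column loop sits innermost, and each thread gets `num_cols` carry-out slots.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Illustrative sketch only, not the PR's kernel. Computes c = A * b for a CSR
// matrix A and dense row-major b, c with num_cols columns each.
void merge_path_spmv_sketch(int num_rows, int num_cols,
                            const std::vector<int>& row_ptrs,
                            const std::vector<int>& col_idxs,
                            const std::vector<double>& vals,
                            const std::vector<double>& b,
                            std::vector<double>& c, int num_threads)
{
    const int nnz = row_ptrs[num_rows];
    const int* row_end_ptrs = row_ptrs.data() + 1;
    const int path_length = num_rows + nnz;
    const int items_per_thread = (path_length + num_threads - 1) / num_threads;

    // Split point (row, nonzero) where `diagonal` crosses the merge path.
    auto search = [&](int diagonal) {
        int lo = std::max(diagonal - nnz, 0);
        int hi = std::min(diagonal, num_rows);
        while (lo < hi) {
            const int mid = lo + (hi - lo) / 2;
            if (row_end_ptrs[mid] <= diagonal - mid - 1) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return std::make_pair(lo, diagonal - lo);
    };

    // One carry-out row per thread, and num_cols carry-out values per thread.
    std::vector<int> row_carry_out(num_threads, num_rows);
    std::vector<double> value_carry_out(num_threads * num_cols, 0.0);

#pragma omp parallel for
    for (int tid = 0; tid < num_threads; ++tid) {
        const int begin = std::min(tid * items_per_thread, path_length);
        const int end = std::min(begin + items_per_thread, path_length);
        const auto start = search(begin);
        const auto stop = search(end);
        int row = start.first;
        int nz = start.second;
        std::vector<double> partial(num_cols);
        // Rows whose end lies in this thread's range are written directly.
        for (; row < stop.first; ++row) {
            std::fill(partial.begin(), partial.end(), 0.0);
            for (; nz < row_end_ptrs[row]; ++nz) {
                for (int j = 0; j < num_cols; ++j) {  // columns innermost
                    partial[j] += vals[nz] * b[col_idxs[nz] * num_cols + j];
                }
            }
            std::copy(partial.begin(), partial.end(),
                      c.begin() + row * num_cols);
        }
        // Leftover nonzeros belong to a row that a later thread finishes.
        std::fill(partial.begin(), partial.end(), 0.0);
        for (; nz < stop.second; ++nz) {
            for (int j = 0; j < num_cols; ++j) {
                partial[j] += vals[nz] * b[col_idxs[nz] * num_cols + j];
            }
        }
        row_carry_out[tid] = row;
        std::copy(partial.begin(), partial.end(),
                  value_carry_out.begin() + tid * num_cols);
    }

    // Serial fix-up: add each thread's carry-out to the row it spilled into.
    for (int tid = 0; tid < num_threads; ++tid) {
        if (row_carry_out[tid] < num_rows) {
            for (int j = 0; j < num_cols; ++j) {
                c[row_carry_out[tid] * num_cols + j] +=
                    value_carry_out[tid * num_cols + j];
            }
        }
    }
}
```

Every row is written exactly once by the thread that consumes its row end, so `c` does not need to be zeroed beforehand. Without OpenMP enabled the pragma is ignored and the loop runs serially with the same result.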
0707e9f to f5f7dfd
Co-authored-by: Marcel Koch <[email protected]>
f5f7dfd to 0e09c20
This PR picks the changes from #1497 by Luka Stanisic (@stanisic), applies the comments, and makes it available with advanced_spmv