ikrit spml cleanup, mkey cache and assorted bug fixes #2354
bot:mellanox:retest
Signed-off-by: Alex Mikheev <[email protected]>
shmem_quit() shall complete all outstanding get_nbi() requests Signed-off-by: Alex Mikheev <[email protected]>
In this case there is no point to add another progress callback Signed-off-by: Alex Mikheev <[email protected]>
use single array instead of array of pointers Signed-off-by: Alex Mikheev <[email protected]>
Signed-off-by: Alex Mikheev <[email protected]>
check every possible transport Signed-off-by: Alex Mikheev <[email protected]>
Signed-off-by: Alex Mikheev <[email protected]>
Signed-off-by: Alex Mikheev <[email protected]>
Signed-off-by: Alex Mikheev <[email protected]>
It improves cpu cache hit ratio. Signed-off-by: Alex Mikheev <[email protected]>
@alex-mikheev test failure:
10601d9
to
8c92506
Compare
@@ -158,41 +162,97 @@ extern int mca_memheap_seg_cmp(const void *k, const void *v);

extern mca_memheap_map_t* memheap_map;

static inline int map_segment_is_va_in(map_base_segment_t *s, const void *va)
let's remove const from "va" in all functions
@@ -158,41 +162,97 @@ extern int mca_memheap_seg_cmp(const void *k, const void *v);

extern mca_memheap_map_t* memheap_map;

static inline int map_segment_is_va_in(map_base_segment_t *s, const void *va)
{
    return ((uintptr_t)va >= (uintptr_t)s->va_base &&
no need to cast to uintptr_t
            (uintptr_t)va < (uintptr_t)s->va_end);
}
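The two suggestions on map_segment_is_va_in (drop the uintptr_t casts, take a non-const va) might look like the sketch below. seg_t is a hypothetical, simplified stand-in for map_base_segment_t, and seg_contains is an illustrative name, not code from this PR. Note that ISO C only defines relational comparison for pointers into the same array object, so the cast-free form relies on a flat address space when va lies outside the segment.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for map_base_segment_t. */
typedef struct {
    void *va_base;  /* first byte of the segment */
    void *va_end;   /* one past the last byte of the segment */
} seg_t;

/* Half-open range check [va_base, va_end) using plain pointer comparison,
 * without casting to uintptr_t and without const on va. */
static inline bool seg_contains(const seg_t *s, void *va)
{
    return va >= s->va_base && va < s->va_end;
}
```

The range is half-open, so va_end itself is reported as outside the segment.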

static inline map_segment_t *memheap_find_seg(const int segno)
no point making int parameter const

static inline int memheap_is_va_in_segment(const void *va, const int segno)
{

no need for a blank line here

static inline void* memheap_va2rva(const void* va, const void* local_base, const void* remote_base)
{
    return (void*) (remote_base > local_base ?
no need to cast to uintptr_t
else {
    tr_id = i;
}
tr_id = i;
could remove tr_id variable

OPAL_THREAD_LOCK(&oshmem_request_lock);
assert(false == put_req->req_put.req_base.req_free_called);
put_req->req_put.req_base.req_free_called = true;
opal_free_list_return (&mca_spml_base_put_requests,
no need to take oshmem_request_lock now?
@@ -690,9 +491,10 @@ sshmem_mkey_t *mca_spml_ikrit_register(void* addr,
}
SPML_VERBOSE(5,
             "rank %d ptl %d addr %p size %llu %s",
             oshmem_proc_pe(oshmem_proc_local()), i, addr, (unsigned long long)size,
             my_rank, i, addr, (unsigned long long)size,
indentation uses tabs, need spaces
if (handle)
    *handle = put_req;

/* fill out request */
put_req->mxm_req.base.mq = mca_spml_ikrit.mxm_mq;
/* request immediate responce if we are getting low on send buffers. We only get responce from remote on ack timeout.
 * Also request explicit ack once in a while */
#if MXM_API < MXM_VERSION(2,0)
#if 0
let's remove the code
    opal_progress();
}

while (0 < mca_spml_ikrit.n_active_gets) {
we can combine the two while loops with &&
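A minimal sketch of the combined loop, assuming the preceding loop waits on a put counter in the same way. The counter n_active_puts, the helper progress_once (standing in for opal_progress()), and drain_all are all hypothetical names for illustration. To keep the behaviour of two sequential loops, the merged condition has to stay true while either counter is nonzero, which means joining the two tests with ||; with && the loop would exit as soon as one counter reached zero.

```c
/* Hypothetical stand-ins for the put/get counters kept in mca_spml_ikrit. */
static int n_active_puts;
static int n_active_gets;

/* Hypothetical stand-in for opal_progress(): completes one request per call. */
static void progress_once(void)
{
    if (n_active_puts > 0) {
        n_active_puts--;
    } else if (n_active_gets > 0) {
        n_active_gets--;
    }
}

/* Wait for all outstanding puts and gets in a single loop.
 * Returns the number of progress calls that were needed. */
static int drain_all(void)
{
    int calls = 0;
    while (0 < n_active_puts || 0 < n_active_gets) {
        progress_once();
        calls++;
    }
    return calls;
}
```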
bot:ibm:retest
@jjhursey Are these false alarms? Please advise.
bot:ibm:retest
Disregard the IBM-CI. The relay machine failed unexpectedly this afternoon. I'll take down the Jenkins until it is back up.
Signed-off-by: Alex Mikheev <[email protected]>
Signed-off-by: Alex Mikheev <[email protected]>
8c92506
to
bf61961
Compare
@jsquyres I think this can be merged
 * Call opal_progress() so that ompi will no deadlock
 * (for example may need to respond to rkey requests)
 */
opal_progress();
@alex-mikheev I don't think this is the right place for opal_progress. The function is not always invoked from a loop. The opal_progress call has to be moved to the while loop that invokes shmem_lock_cswap.
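The restructuring suggested here could be sketched as follows. This is a single-process simulation, not the real shmem_lock_cswap: lock_word, progress_stub, try_lock_once, and lock_blocking are hypothetical names, and the stub pretends that the third progress call delivers the remote release. What it shows is the split: the single-attempt function never calls progress, and only the retry loop around it does.

```c
#include <stdbool.h>

static int lock_word;       /* 0 = free, otherwise the owner's id (hypothetical) */
static int progress_calls;  /* counts calls to the progress stub */

/* Hypothetical stand-in for opal_progress(). For the simulation, the third
 * call "delivers" the release of the lock by its current owner. */
static void progress_stub(void)
{
    progress_calls++;
    if (progress_calls == 3) {
        lock_word = 0;
    }
}

/* One compare-and-swap style attempt. No progress call in here, so callers
 * that are not looping do not pay for it. */
static bool try_lock_once(int me)
{
    if (lock_word == 0) {
        lock_word = me;
        return true;
    }
    return false;
}

/* The retry loop owns the progress call, so other requests keep being
 * serviced while this caller spins. */
static void lock_blocking(int me)
{
    while (!try_lock_once(me)) {
        progress_stub();
    }
}
```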
removing rm approved from this so the ucx/mlnx folks can sort this out.
Signed-off-by: Alex Mikheev <[email protected]>
Current standard says that behaviour in the case of error is undefined Signed-off-by: Alex Mikheev <[email protected]>
@shamisp please take a look at my last two commits
@hppritcha Looks good to me.
👍
@yosefe @igor-ivanov please review