bad range check (rdma osc w/ odd datatypes) #3576

@markalle

On master, MPI_Get with odd datatypes can trigger a range check error in osc_rdma_get_remote_segment(), e.g.:

[1,0]<stdout>:remote address range 0x7eff7413035c - 0x7eff7414dedc is out of range. Valid address range is 0x7eff74046010 - 0x7eff74146825 (1050645 bytes)

osc_rdma_get_remote_segment() takes as its 3rd and 4th args
* target_disp
* length
which it uses to determine whether the rdma falls within the bounds of the window (actually it only checks the upper bound, but I'm okay with that).

Anyway, the caller previously was passing in the length argument as
target_datatype->super.size * target_count
which doesn't really represent the number of bytes after target_disp for which data exists. In particular, I could create a datatype as
{ disp -4, len 4 } and use target_disp 4
and that would be bytes 0-3 of the window, whereas the original code would think it was bytes 4-7 and could abort at the range check.

I'm making a pull request to use opal_datatype_span().
