[SYCL] Defer shadow copy creation in SYCLMemObjT
#11348
intel#10334 causes a performance regression since `HostPtr` cannot be reused when it is read-only. This change fixes the regression by deferring the copy operation to the creation of a writable accessor. Signed-off-by: Michael Aziz <[email protected]>
Could there be any race condition if we'd have two concurrent threads with different read/write properties? IIUC, there will be no UB as long as the actual accesses are all read-only.
That's a good question. I'm not certain if my original implementation could cause race conditions so I've updated it. I moved the shadow copy creation to the |
steffenlarsen
left a comment
LGTM! @aelovikov-intel - Would you like another look?
```cpp
if (MRecord != nullptr && MUserPtr != InitialUserPtr) {
  for (auto &it : MRecord->MAllocaCommands) {
    if (it->MMemAllocation == InitialUserPtr) {
      it->MMemAllocation = MUserPtr;
    }
  }
}
```
Tiny nit to avoid a bit of indentation. If you prefer the current version I am also okay with sticking with it.
```diff
- if (MRecord != nullptr && MUserPtr != InitialUserPtr) {
-   for (auto &it : MRecord->MAllocaCommands) {
-     if (it->MMemAllocation == InitialUserPtr) {
-       it->MMemAllocation = MUserPtr;
-     }
-   }
- }
+ if (!MRecord || MUserPtr == InitialUserPtr)
+   return;
+ for (auto &it : MRecord->MAllocaCommands)
+   if (it->MMemAllocation == InitialUserPtr)
+     it->MMemAllocation = MUserPtr;
```
I'm good with your review. @0x12CC, this needs a merge with conflict resolution.
#10334 causes a performance regression since `HostPtr` can't be reused when it's read-only. This PR fixes the regression by deferring the copy operation to the creation of a writable accessor. It includes the following changes:

- A `SYCLMemObjT::MCreateShadowCopy` member to defer allocation. When the `HostPtr` cannot be reused since it's read-only, `SYCLMemObjT::handleHostData` sets this member to a function that will allocate the shadow copy.
- A `SYCLMemObjT::handleWriteAccessorCreation` member function. This function calls `SYCLMemObjT::MCreateShadowCopy` and updates any existing `MAllocaCommands` if `MUserPtr` changed.
- `handleWriteAccessorCreation` gets called to ensure that any required memory allocation occurs.

With this change, the allocation and copying overhead occurs during the creation of the first writable accessor. There's no overhead if all of the relevant accessors use `sycl::access_mode::read`.
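
For readers unfamiliar with the pattern, here is a minimal standalone sketch of deferring a shadow copy until the first write request. It is not the runtime's code: `LazyShadowBuffer`, `getReadPtr`, `getWritePtr`, and `hasShadowCopy` are hypothetical names invented for illustration, and only `MCreateShadowCopy` and `MUserPtr` echo members mentioned in the description. The sketch is single-threaded and doesn't attempt to address the race-condition question raised in review.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a memory object that wraps read-only host data.
class LazyShadowBuffer {
public:
  LazyShadowBuffer(const char *HostPtr, std::size_t Size)
      : MUserPtr(HostPtr), MSize(Size) {
    // Record how to create the shadow copy, but do not create it yet.
    MCreateShadowCopy = [this, HostPtr]() {
      MShadowCopy = std::make_unique<char[]>(MSize);
      std::memcpy(MShadowCopy.get(), HostPtr, MSize);
      MUserPtr = MShadowCopy.get();
    };
  }

  // The stored callback captures `this`, so copying would dangle.
  LazyShadowBuffer(const LazyShadowBuffer &) = delete;
  LazyShadowBuffer &operator=(const LazyShadowBuffer &) = delete;

  // Read access reuses the host pointer until a shadow copy exists.
  const char *getReadPtr() const { return MUserPtr; }

  // The first write request pays for the allocation and the copy.
  char *getWritePtr() {
    if (MCreateShadowCopy) {
      auto Create = std::move(MCreateShadowCopy);
      MCreateShadowCopy = nullptr; // ensure the copy happens only once
      Create();
    }
    return MShadowCopy.get();
  }

  bool hasShadowCopy() const { return MShadowCopy != nullptr; }

private:
  const char *MUserPtr;                    // current backing storage
  std::size_t MSize;                       // size in bytes
  std::unique_ptr<char[]> MShadowCopy;     // allocated lazily
  std::function<void()> MCreateShadowCopy; // empty once the copy exists
};
```

In this sketch, a caller that only ever uses `getReadPtr()` never triggers an allocation, which mirrors the read-only case described above. The first `getWritePtr()` call allocates and copies, and later calls return the same storage.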