
[SYCL] Reduce the time to get a kernel from cache #4186


Merged
merged 5 commits on Aug 11, 2021

Conversation

alexanderfle
Contributor

@alexanderfle alexanderfle commented Jul 26, 2021

The main idea of this patch is to speed up getting information from the cache.

Before this patch:

The regular cache requires too many synchronizations and too many searches across multiple maps to get the information.

First, the regular cache has to get the program from the cache via getBuiltPIProgram: in most cases this requires 3 searches in getKernelSetId, then, under a mutex, getting the program from the cache while checking an atomic variable to see whether it is already built or still in progress. Then, using that program, it performs 2 more searches under a mutex to get the stored info from the cache, again checking the atomic variable to see whether it is already built or still in progress.
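A rough sketch of the lookup cost being described, for illustration only (all type and member names below are hypothetical stand-ins; the real intel/llvm runtime code around getBuiltPIProgram and getKernelSetId is more involved):

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Placeholder for a cache entry whose build may still be in progress.
struct BuildEntry {
  std::atomic<bool> Built{false}; // already built, or still in progress?
  int Object = 0;                 // stands in for a PiProgram/PiKernel handle
};

// Placeholder for the regular (slow) cache: multiple maps per lookup.
struct RegularCache {
  std::mutex Mtx;
  std::map<std::string, int> KernelSetIds;             // resolved first
  std::map<int, std::shared_ptr<BuildEntry>> Programs; // then the program
  std::map<int, std::shared_ptr<BuildEntry>> Kernels;  // then the kernel

  // Even a hit pays for several map searches plus atomic checks.
  std::shared_ptr<BuildEntry> getKernel(const std::string &Name) {
    std::lock_guard<std::mutex> Lock(Mtx);
    auto SetIt = KernelSetIds.find(Name);       // search #1
    if (SetIt == KernelSetIds.end())
      return nullptr;
    auto ProgIt = Programs.find(SetIt->second); // search #2
    if (ProgIt == Programs.end() || !ProgIt->second->Built.load())
      return nullptr;                           // program still being built
    auto KernIt = Kernels.find(SetIt->second);  // search #3
    if (KernIt == Kernels.end() || !KernIt->second->Built.load())
      return nullptr;                           // kernel still being built
    return KernIt->second;
  }
};
```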

After this patch:

With the fast cache, only 1 search under a mutex is required; if the info is not found, it means the info has not been prepared yet, and the standard, more expensive path must be taken.

In addition, the fast cache keeps the program next to the kernel, so that both can be returned together, reducing the search time spent on one PI call in the most common case in commands.cpp.
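The fast-path lookup with fallback could be sketched as follows; this is a minimal illustration, and the key and value types here are hypothetical placeholders, not the actual KernelProgramCache API:

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <utility>

using FastKernelCacheKey = std::string;        // placeholder key type
using KernelProgramPair = std::pair<int, int>; // stands in for {PiKernel, PiProgram}

struct FastKernelCache {
  std::mutex Mtx;
  std::map<FastKernelCacheKey, KernelProgramPair> Map;

  // One search under one mutex; a miss means the slow path must run.
  bool tryGet(const FastKernelCacheKey &Key, KernelProgramPair &Out) {
    std::lock_guard<std::mutex> Lock(Mtx);
    auto It = Map.find(Key);
    if (It == Map.end())
      return false;   // not prepared yet: fall back to the regular cache
    Out = It->second; // kernel and program returned together
    return true;
  }

  void put(const FastKernelCacheKey &Key, KernelProgramPair Val) {
    std::lock_guard<std::mutex> Lock(Mtx);
    Map.emplace(Key, Val);
  }
};
```

Storing the program alongside the kernel is what lets a fast-cache hit skip the separate program lookup entirely.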

Signed-off-by: Alexander Flegontov [email protected]

@alexanderfle
Contributor Author

/summary:run

@alexanderfle alexanderfle marked this pull request as ready for review July 27, 2021 15:58
@alexanderfle alexanderfle requested a review from a team as a code owner July 27, 2021 15:58
@alexanderfle alexanderfle requested a review from againull July 27, 2021 15:58
@s-kanaev
Contributor

@alexanderfle, could you please elaborate, in the patch description, on the idea employed in this patch to improve the run time of cache operations?

@s-kanaev
Contributor

s-kanaev commented Aug 2, 2021

With the fast cache, only 1 search under a mutex is required; if the info is not found, it means the info has not been prepared yet, and the standard, more expensive path must be taken.

Is the following sequence correct here upon request to get a kernel?

  1. Lock fast cache mutex
  2. Check if data available in fast cache:
    2.1 If it is, return the data. FINISH
    2.2 If not, proceed to p.3
  3. Unlock fast cache mutex
  4. Create data bits
  5. Lock regular cache mutex(-es)
  6. Insert data into regular cache
  7. Lock fast cache mutex
  8. Insert data into fast cache
  9. Unlock fast cache mutex
  10. Unlock regular cache mutex(-es)
  11. Return data bits created in p.4
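The sequence asked about above could be sketched roughly as follows; this is only an illustration of the proposed locking order (with the fast mutex nested inside the regular one, per steps 5-10), and all names are hypothetical, not the patch's actual implementation:

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

struct TwoLevelCache {
  std::mutex FastMtx, RegularMtx;
  std::map<std::string, int> Fast, Regular;

  // Stand-in for the expensive build step (a real implementation would
  // compile the program and create the kernel here).
  int buildKernel(const std::string &Key) { return static_cast<int>(Key.size()); }

  int getKernel(const std::string &Key) {
    {
      std::lock_guard<std::mutex> FastLock(FastMtx); // 1. lock fast mutex
      auto It = Fast.find(Key);                      // 2. check fast cache
      if (It != Fast.end())
        return It->second;                           // 2.1 hit: return the data
    }                                                // 3. unlock fast mutex
    int Data = buildKernel(Key);                     // 4. create data bits
    std::lock_guard<std::mutex> RegLock(RegularMtx); // 5. lock regular mutex
    Regular.emplace(Key, Data);                      // 6. insert into regular cache
    {
      std::lock_guard<std::mutex> FastLock(FastMtx); // 7. lock fast mutex
      Fast.emplace(Key, Data);                       // 8. insert into fast cache
    }                                                // 9. unlock fast mutex
    return Data;                                     // 10-11. unlock regular, return
  }
};
```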

@alexanderfle
Contributor Author

Is the following sequence correct here upon request to get a kernel?

No, because implementing that would require changes to getOrBuild(), since the fast cache would also have to store the PiProgram. And, currently, the PiProgram is unknown inside getOrBuild() in the case when we call getOrBuild() to get the kernel.

@alexanderfle
Contributor Author

@s-kanaev , ping

Contributor

@s-kanaev s-kanaev left a comment


The changes look good.
Though I believe we need tests for the fast cache here.

Also, this sequence involves two mutexes:

  1. Lock fast cache mutex
  2. Check if data available in fast cache:
    2.1 If it is, return the data. FINISH + Unlock fast cache mutex
    2.2 If not, proceed to p.3
  3. Unlock fast cache mutex
  4. Create data bits
  5. Lock regular cache mutex(-es)
  6. Insert data into regular cache
  7. Unlock regular cache mutex(-es)
  8. Lock fast cache mutex
  9. Insert data into fast cache
  10. Unlock fast cache mutex
  11. Return data bits created in p.4

Hence, we need a thread safety test also.
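A minimal thread-safety stress sketch for the two-mutex sequence above, assuming placeholder types (the real cache types in the runtime differ); every thread must observe the same value for a given key, and nothing must deadlock:

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

struct TwoMutexCache {
  std::mutex FastMtx, RegularMtx;
  std::map<std::string, int> Fast, Regular;

  int getOrBuild(const std::string &Key, int Value) {
    {
      std::lock_guard<std::mutex> L(FastMtx);    // fast path: one search
      auto It = Fast.find(Key);
      if (It != Fast.end())
        return It->second;
    }
    {
      std::lock_guard<std::mutex> L(RegularMtx); // slow path: regular cache
      Regular.emplace(Key, Value);
    }                                            // regular mutex released first,
    std::lock_guard<std::mutex> L(FastMtx);      // then the fast mutex is taken
    Fast.emplace(Key, Value);
    return Fast[Key];
  }
};

// Hammer the cache from several threads concurrently.
bool stressTest() {
  TwoMutexCache Cache;
  std::vector<std::thread> Threads;
  std::mutex OkMtx;
  bool Ok = true;
  for (int T = 0; T < 8; ++T)
    Threads.emplace_back([&Cache, &Ok, &OkMtx] {
      for (int I = 0; I < 1000; ++I) {
        // The value is a function of the key, so every thread must agree.
        int V = Cache.getOrBuild("kernel" + std::to_string(I % 4), I % 4);
        if (V != I % 4) {
          std::lock_guard<std::mutex> L(OkMtx);
          Ok = false;
        }
      }
    });
  for (auto &Th : Threads)
    Th.join();
  return Ok;
}
```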

Signed-off-by: Alexander Flegontov <[email protected]>
@alexanderfle alexanderfle requested a review from s-kanaev August 11, 2021 13:46
Contributor

@s-kanaev s-kanaev left a comment


LGTM

@alexanderfle
Contributor Author

@bader, ping

@bader
Contributor

bader commented Aug 11, 2021

@bader, ping

@alexanderfle, pong?

@alexanderfle
Contributor Author

@bader, no :) I mean, could you please have a look at the PR, as only an authorized user can handle this.

@bader
Contributor

bader commented Aug 11, 2021

could you please have a look at the PR, as only an authorized user can handle this.

What do you mean by "handle this"?

@alexanderfle
Contributor Author

What do you mean by "handle this"?

It means: handle the merge

@bader
Contributor

bader commented Aug 11, 2021

could you please have a look at the PR, as only an authorized user can handle this.

What do you mean by "handle this"?

It means: handle the merge

FYI: There are at least 8 people who can handle the merge (including @againull, who is assigned to review this pull request).
E.g. #4272 was merged by @vladimirlaz and #4281 by @romanovvlad.

@alexanderfle
Contributor Author

FYI: There are at least 8 people who can handle the merge (including @againull, who is assigned to review this pull request).
E.g. #4272 was merged by @vladimirlaz and #4281 by @romanovvlad.

ok, I see.

@alexanderfle
Contributor Author

@againull, could you take a look, please? Please merge if it looks okay to you.

Contributor

@againull againull left a comment


LGTM

@againull againull merged commit c16705a into intel:sycl Aug 11, 2021