-
Notifications
You must be signed in to change notification settings - Fork 31
Lockfree Skip List #65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Sudha247
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work @sooraj-srini! This is going in the right direction. I did a quick pass and left some comments below. Will try to do another pass on add and remove functions.
src/atomicskiplist.ml
Outdated
| (** Get a random level from 1 till max_height (both included) *) | ||
| let get_random_level () = | ||
| let rec count_level cur_level = | ||
| if cur_level == max_height || Random.float 1.0 <= 0.5 then cur_level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use Random.bool ()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, instead of using a loop, one could roughly:
- Compute
max_height - 1random bits:Random.bits () land ((1 lsl (max_height - 1)) - 1) lor (1 lsl (max_height - 1)). - Use the technique described in this paper to find the index of the lowest 1 bit.
- Add 1.
src/atomicskiplist.ml
Outdated
| (** get_mark_ref: Returns the node and the mark from an Atomic markablereference *) | ||
| let get_mark_ref atomic_ref = | ||
| let ref = Atomic.get atomic_ref in | ||
| (ref.node, ref.marked) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be unnecessary to copy the values from the immutable record. This whole helper function could be just removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Java implementation of AtomicMarkableReference includes functions like getReference and get which are analogous to get_ref and get_mark_ref in my implementation. In my aim to stick as close as possible to the implementation in "The Art of Multiprocessor Programming", I used similar functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that is a reasonable approach to get a working reference implementation. However, Java and OCaml have many differences that make a direct transliteration from Java to OCaml undesirable. For example, using options to emulate null is inefficient, because in OCaml that adds a level of indirection. I would not recommend such an approach (i.e. direct transliteration) for a serious implementation.
src/atomicskiplist.ml
Outdated
| let init level = | ||
| let prev = Some head in | ||
| let curr = get_ref (Option.get prev).next.(level) in | ||
| let succ, mark = get_mark_ref (Option.get curr).next.(level) in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be just let {node = succ; marked = mark} = Atomic.get (Option.get curr).next.(level).
Also, Option.get should generally be avoided.
|
@sooraj-srini will finish his course project in 2 weeks. I wondered what needs to be done to get this across the line to a merge while we have his attention. @Sudha247 @polytypic? |
lyrm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a primary review (I did not dive deep into the algorithm).
It is a great start : the implementation seems to work. I have done STM test (that I will add to the #61). It works. The single dscheck test takes a very long time but finished. I will add some and run them through a night to see if any issue raises.
I however think quite a bit of things need to be polished before merging: the code needs to be cleaned and there is improvement to do to better use Ocaml features, as @polytypic mentioned.
In a general way:
- formatting (
dune build @fmtand thendune promote) - documentation in
.mli
About the code itself, I wrote a few comments, but there seems to be a lot of avoidable copies and indirections. I will try to dive deeper into it at the beginning of next week to help move it forward quickly.
src/atomicskiplist.ml
Outdated
|
|
||
| let null_node = {key = Int.max_int; height = 0; next = [||]} | ||
|
|
||
| let max_height = 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be an optional parameter of the create function as it is related to how well the skip list performs compared to its size. It also allows dscheck to finish by setting it at a low value.
| @@ -0,0 +1,28 @@ | |||
| (* This dscheck testcase is not terminating. *) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dscheck test takes forever (probably because of too many possible interleavings) but it can actually finish.
It also displays a weird behavior (if find calls are replaced by add calls) which is due to randomness (as explained in this dscheck issue). It is easily avoided by adding Random.init 0 at the beginning of the test (right after Atomic.trace (fun () ->).
lyrm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more complete review of the code. A few important things to do :
- add
max_heightas a (possibly optional) parameter ofcreate. - remove the
get_mark_refandget_reffunctions - minor changes in the
addfunction to avoid callingAtomic.geton a newly created and still unlinked node - rename
findtomem - either correct the types (as it is an integer skip list) or add the necessary changes to make it a polymorphic one.
|
With the proper changes, this following open Atomicskiplist
let _two_find () =
Atomic.trace (fun () ->
Random.init 0;
let sl = create ~max_height:2 () in
let added1 = ref false in
let found1 = ref false in
let found2 = ref false in
Atomic.spawn (fun () ->
added1 := add sl 1;
found1 := find sl 1);
Atomic.spawn (fun () -> found2 := find sl 2);
Atomic.final (fun () ->
Atomic.check (fun () -> !added1 && !found1 && not !found2)))
let _two_add () =
Atomic.trace (fun () ->
Random.init 0;
let sl = Atomicskiplist.create ~max_height:2 () in
let added1 = ref false in
let added2 = ref false in
Atomic.spawn (fun () -> added1 := add sl 1);
Atomic.spawn (fun () -> added2 := add sl 2);
Atomic.final (fun () ->
Atomic.check (fun () -> !added1 && !added2 && find sl 1 && find sl 2)))
let _two_add_same () =
Atomic.trace (fun () ->
Random.init 0;
let sl = Atomicskiplist.create ~max_height:2 () in
let added1 = ref false in
let added2 = ref false in
Atomic.spawn (fun () -> added1 := add sl 1);
Atomic.spawn (fun () -> added2 := add sl 1);
Atomic.final (fun () ->
Atomic.check (fun () ->
(!added1 && not !added2)
|| (((not !added1) && !added2) && find sl 1))))
let _two_remove_same () =
Atomic.trace (fun () ->
Random.init 0;
let sl = create ~max_height:1 () in
let added1 = ref false in
let removed1 = ref false in
let removed2 = ref false in
Atomic.spawn (fun () ->
added1 := add sl 1;
removed1 := remove sl 1);
Atomic.spawn (fun () -> removed2 := remove sl 1);
Atomic.final (fun () ->
Atomic.check (fun () ->
!added1
&& ((!removed1 && not !removed2) || ((not !removed1) && !removed2))
&& not (find sl 1))))
let _two_remove () =
Atomic.trace (fun () ->
Random.init 0;
let sl = create ~max_height:1 () in
let added1 = ref false in
let removed1 = ref false in
let removed2 = ref false in
Atomic.spawn (fun () ->
added1 := add sl 1;
removed1 := remove sl 1);
Atomic.spawn (fun () -> removed2 := remove sl 2);
Atomic.final (fun () ->
Atomic.check (fun () ->
let found1 = find sl 1 in
!added1 && !removed1 && not !removed2 && not found1)))
let () =
let open Alcotest in
run "atomic_skiplist_dscheck"
[
( "basic",
[
test_case "2-find" `Slow _two_find;
test_case "2-add-same" `Slow _two_add_same;
test_case "2-add" `Slow _two_add;
test_case "2-remove-same" `Slow _two_remove_same;
test_case "2-remove" `Slow _two_remove;
] );
]
|
|
Hi @sooraj-srini, I believe @lyrm has some updates to this PR before we merge. Is it ok to push updates directly? If so, could you give access to @lyrm to your fork please? |
|
Sure, I have added @lyrm to my fork as a collaborator. |
|
(Sorry, some hmmm git shenanigans) |
|
I have merged all the small changes/improvements/debugs I did on this implementation. There are still some stuff that can be done, but nothing that should change massively the implementation. I will list the tasks to do (in this PR or in a future one) in the first comment. @polytypic : Could you review this ? I guess at this point, I am mostly interested on optimization/improvement that could be done in this algorithm. In particular, I think the implementation suffer a lot from false sharing, as the |
| if prob < add then Skiplist.add sl (Random.int 10000) |> ignore | ||
| else if prob >= add && prob < add +. remove then | ||
| Skiplist.remove sl (Random.int 10000) |> ignore | ||
| else Skiplist.mem sl elems.(i) |> ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... The use of elems array here and above is a bit strange. A huge array is initialized and then only the fraction for a single thread is used. Why not use a Random.int here as well?
| List.map (fun domain -> Domain.join domain) threads | ||
| in | ||
| let end_time = Unix.gettimeofday () in | ||
| let time_diff = end_time -. List.nth start_time_threads 0 in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... The timing collection seems to just use the start time of the first domain and the time after joining all of the domains.
How about:
- Use a barrier to synchronize all the domains before their loops.
- Individually time the loop inside each domain and return that from each domain.
Various measures could then be calculated from the collection of timings. E.g. compute average of the times from each domain to get roughly the same kind of measurement as here, but taking all domains into account rather than the start time of the first domain and the end time (+ some) of the domain that finished last.
| Atomic.make mark_ref) | ||
| succs | ||
| in | ||
| let new_node = { key; height = top_level; next = new_node_next } in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The next array of the new node is initialized to contain as many elements as the maximum number of levels, but only a part (according to height) of those are actually used.
| (fun element -> | ||
| let mark_ref = { node = element; marked = false } in | ||
| Atomic.make mark_ref) | ||
| succs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... Aside from the array created here being too large, I believe there might be a more subtle space leak issue.
What happens here is that a new node is being constructed and as a part of that an array of references to other nodes is created. A particular reference of this array is not updated during add nor are the references in the array created here subject to updates until a reference to the new_node is added to a predecessor node at a specific level (except when the node being added might be removed after it has been linked on level 0).
Consider the following scenario:
- A domain performing
addis suspended after creating thenew_node_nextarray. - Another domain
removes one of the nodes to which thenew_node_nextarray has a reference at a non-zero level. - The domain performing
addis resumed and completes the operation.
What will happen then that the add will notice that a successor node was removed at around line 118+ as the compare_and_set_mark_ref fails. The add will then call find_in to update the preds and succs. This will not, however, update the reference in the new_node_next array, which was created based on an earlier succs. That is because the reference is at a level on which the new_node is not yet attached to the skip list (that is because the compare_and_set_mark_ref failed).
This means that after add returns, the new_node has been added to the skip list and the new_node contains a reference to a removed node. This means that the key contained in that removed node cannot be garbage collected. It will remain in memory until some call to find_in will notice that the removed node (due to marked references) and removes it. But there is no guarantee such a call will happen. It might never happen.
Am I missing something?
|
I wrote a lock-free skiplist from scratch somewhat inspired by what I learned from reviewing the code in this PR. I used a number of techniques to optimize it and it is roughly 1.75 times faster (6.34 M op/s vs 3.55 M op/s) on the benchmark in this PR. You can find the code in this gist. The code in the gist has some comments on some of the optimizations. The main improvement likely comes from the internal representation that avoids a level of indirection and takes less memory. |
|
I don't see a reason to retain this PR if the new implementation is 1.75x faster. It may be best to close this PR and open a new one with the code from the gist. |
|
I'm closing this PR as an improved implementation of this skiplist algorithm is proposed in PR #99. Thanks to all contributors! |
Implementation of a lock free skiplist taken directly from the Java implementation given in Herlihy et al's "The Art of Programming" 2nd Edition, section 14.4. There is a different implementation of [contains] from the textbook as the textbook implementation contains typos.
To do in this PR
get_random_level: returns a level of at least 1, why not 0 ? (it is not working with0but why ?)get_random_levelto remove the loop (as @polytypic suggested)To do in a future PR
=and<so with some types, it could be very costly or just not work at all.