forked from intel/llvm
    
        
        - 
                Notifications
    
You must be signed in to change notification settings  - Fork 3
 
Sycl graph poc #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
          
     Closed
      
      
    
  
     Closed
                    Changes from 18 commits
      Commits
    
    
            Show all changes
          
          
            20 commits
          
        
        Select commit
          Hold shift + click to select a range
      
      7c62056
              
                Inital version of sycl graph prototype
              
              
                reble 59bb7da
              
                Adding initial sycl graph doc
              
              
                reble 528017a
              
                Reusing command list for re-execution (WIP)
              
              
                reble aee48a5
              
                Adding lazy execution property to queue
              
              
                reble 24fa5a9
              
                fix merge
              
              
                reble b7ce271
              
                Update pi_level_zero.cpp
              
              
                reble f3d30ed
              
                update extension proposal started to incorporate feedback
              
              
                reble 0e96d12
              
                typo
              
              
                reble 8f1a8dc
              
                Apply suggestions from code review
              
              
                reble a8c7265
              
                fix typos and syntax issues
              
              
                reble 5055f59
              
                Merge branch 'sycl' of github.com:reble/llvm into sycl
              
              
                reble 60507c1
              
                Propagate lazy queue property
              
              
                julianmi 7221deb
              
                Merge branch 'sycl' into sycl-graph-update
              
              
                reble 9209b57
              
                fix formatting issues
              
              
                reble 7208ad4
              
                fix issue introd. by recent merge
              
              
                reble 5d4eab1
              
                fix formatting
              
              
                reble 5956e20
              
                Extend API and incorporate more feedback.
              
              
                reble be05f1d
              
                Delete SYCL_EXT_ONEAPI_GRAPH.asciidoc
              
              
                reble f03870e
              
                update API to recent proposal
              
              
                reble 9fe0962
              
                fix rebase issue
              
              
                reble File filter
Filter by extension
Conversations
          Failed to load comments.   
        
        
          
      Loading
        
  Jump to
        
          Jump to file
        
      
      
          Failed to load files.   
        
        
          
      Loading
        
  Diff view
Diff view
There are no files selected for viewing
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
              
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
              
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
              
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
              
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
              
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
              | Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,223 @@ | ||
| //==--------- graph.hpp --- SYCL graph extension ---------------------------==// | ||
| // | ||
| // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
| // See https://llvm.org/LICENSE.txt for license information. | ||
| // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
| // | ||
| //===----------------------------------------------------------------------===// | ||
| 
     | 
||
| #pragma once | ||
| 
     | 
||
| #include <CL/sycl/detail/defines_elementary.hpp> | ||
| 
     | 
||
| #include <list> | ||
| #include <set> | ||
| 
     | 
||
| __SYCL_INLINE_NAMESPACE(cl) { | ||
| namespace sycl { | ||
| namespace ext { | ||
| namespace oneapi { | ||
| namespace experimental { | ||
| namespace detail { | ||
| 
     | 
||
| struct node_impl; | ||
| 
     | 
||
| struct graph_impl; | ||
| 
     | 
||
| using node_ptr = std::shared_ptr<node_impl>; | ||
| 
     | 
||
| using graph_ptr = std::shared_ptr<graph_impl>; | ||
| 
     | 
||
| class wrapper { | ||
| using T = std::function<void(sycl::handler &)>; | ||
| T my_func; | ||
| std::vector<sycl::event> my_deps; | ||
| 
     | 
||
| public: | ||
| wrapper(T t, const std::vector<sycl::event> &deps) | ||
| : my_func(t), my_deps(deps){}; | ||
| 
     | 
||
| void operator()(sycl::handler &cgh) { | ||
| cgh.depends_on(my_deps); | ||
| std::invoke(my_func, cgh); | ||
| } | ||
| }; | ||
| 
     | 
||
| struct node_impl { | ||
| bool is_scheduled; | ||
| 
     | 
||
| graph_ptr my_graph; | ||
| sycl::event my_event; | ||
| 
     | 
||
| std::vector<node_ptr> my_successors; | ||
| std::vector<node_ptr> my_predecessors; | ||
| 
     | 
||
| std::function<void(sycl::handler &)> my_body; | ||
| 
     | 
||
| void exec(sycl::queue q) { | ||
| std::vector<sycl::event> __deps; | ||
| for (auto i : my_predecessors) | ||
| __deps.push_back(i->get_event()); | ||
| my_event = q.submit(wrapper{my_body, __deps}); | ||
| } | ||
| 
     | 
||
| void register_successor(node_ptr n) { | ||
| my_successors.push_back(n); | ||
| n->register_predecessor(node_ptr(this)); | ||
| } | ||
| 
     | 
||
| void register_predecessor(node_ptr n) { my_predecessors.push_back(n); } | ||
| 
     | 
||
| sycl::event get_event(void) { return my_event; } | ||
| 
     | 
||
| template <typename T> | ||
| node_impl(graph_ptr g, T cgf) | ||
| : is_scheduled(false), my_graph(g), my_body(cgf) {} | ||
| 
     | 
||
| // Recursively adding nodes to execution stack: | ||
| void topology_sort(std::list<node_ptr> &schedule) { | ||
| is_scheduled = true; | ||
| for (auto i : my_successors) { | ||
| if (!i->is_scheduled) | ||
| i->topology_sort(schedule); | ||
| } | ||
| schedule.push_front(node_ptr(this)); | ||
| } | ||
| }; | ||
| 
     | 
||
| struct graph_impl { | ||
| std::set<node_ptr> my_roots; | ||
| std::list<node_ptr> my_schedule; | ||
| 
     | 
||
| graph_ptr parent; | ||
| 
     | 
||
| void exec(sycl::queue q) { | ||
| if (my_schedule.empty()) { | ||
| for (auto n : my_roots) { | ||
| n->topology_sort(my_schedule); | ||
| } | ||
| } | ||
| for (auto n : my_schedule) | ||
| n->exec(q); | ||
| } | ||
| 
     | 
||
| void exec_and_wait(sycl::queue q) { | ||
| exec(q); | ||
| q.wait(); | ||
| } | ||
| 
     | 
||
| void add_root(node_ptr n) { | ||
| my_roots.insert(n); | ||
| for (auto n : my_schedule) | ||
| n->is_scheduled = false; | ||
| my_schedule.clear(); | ||
| } | ||
| 
     | 
||
| void remove_root(node_ptr n) { | ||
| my_roots.erase(n); | ||
| for (auto n : my_schedule) | ||
| n->is_scheduled = false; | ||
| my_schedule.clear(); | ||
| } | ||
| 
     | 
||
| graph_impl() {} | ||
| }; | ||
| 
     | 
||
| } // namespace detail | ||
| 
     | 
||
| class node; | ||
| 
     | 
||
| class graph; | ||
| 
     | 
||
| class executable_graph; | ||
| 
     | 
||
| struct node { | ||
| // TODO: add properties to distinguish between empty, host, device nodes. | ||
| detail::node_ptr my_node; | ||
| detail::graph_ptr my_graph; | ||
| 
     | 
||
| template <typename T> | ||
| node(detail::graph_ptr g, T cgf) | ||
| : my_graph(g), my_node(new detail::node_impl(g, cgf)){}; | ||
| void register_successor(node n) { my_node->register_successor(n.my_node); } | ||
| void exec(sycl::queue q, sycl::event = sycl::event()) { my_node->exec(q); } | ||
| 
     | 
||
| void set_root() { my_graph->add_root(my_node); } | ||
| 
     | 
||
| // TODO: Add query functions: is_root, ... | ||
| }; | ||
| 
     | 
||
| class executable_graph { | ||
| public: | ||
| int my_tag; | ||
| sycl::queue my_queue; | ||
| 
     | 
||
| void exec_and_wait(); // { my_queue.wait(); } | ||
| 
     | 
||
| executable_graph(detail::graph_ptr g, sycl::queue q) | ||
| : my_queue(q), my_tag(rand()) { | ||
| g->exec(my_queue); | ||
| } | ||
| }; | ||
| 
     | 
||
| class graph { | ||
| public: | ||
| // Adding empty node with [0..n] predecessors: | ||
| node add_empty_node(const std::vector<node> &dep = {}); | ||
| 
     | 
||
| // Adding node for host task | ||
| template <typename T> | ||
| node add_host_node(T hostTaskCallable, const std::vector<node> &dep = {}); | ||
| 
     | 
||
| // Adding device node: | ||
| template <typename T> | ||
| node add_device_node(T cgf, const std::vector<node> &dep = {}); | ||
| 
     | 
||
| // Adding dependency between two nodes. | ||
| void make_edge(node sender, node receiver); | ||
| 
     | 
||
| // TODO: Extend queue to directly submit graph | ||
| void exec_and_wait(sycl::queue q); | ||
| 
     | 
||
| executable_graph exec(sycl::queue q) { | ||
| return executable_graph{my_graph, q}; | ||
| }; | ||
| 
     | 
||
| graph() : my_graph(new detail::graph_impl()) {} | ||
| 
     | 
||
| // Creating a subgraph (with predecessors) | ||
| graph(graph &parent, const std::vector<node> &dep = {}) {} | ||
| 
     | 
||
| bool is_subgraph(); | ||
| 
     | 
||
| private: | ||
| detail::graph_ptr my_graph; | ||
| }; | ||
| 
     | 
||
| void executable_graph::exec_and_wait() { my_queue.wait(); } | ||
| 
     | 
||
| template <typename T> | ||
| node graph::add_device_node(T cgf, const std::vector<node> &dep) { | ||
| node _node(my_graph, cgf); | ||
| if (!dep.empty()) { | ||
| for (auto n : dep) | ||
| this->make_edge(n, _node); | ||
| } else { | ||
| _node.set_root(); | ||
| } | ||
| return _node; | ||
| } | ||
| 
     | 
||
| void graph::make_edge(node sender, node receiver) { | ||
| sender.register_successor(receiver); // register successor | ||
| my_graph->remove_root(receiver.my_node); // remove receiver from root node | ||
| // list | ||
| } | ||
| 
     | 
||
| void graph::exec_and_wait(sycl::queue q) { my_graph->exec_and_wait(q); }; | ||
| 
     | 
||
| } // namespace experimental | ||
| } // namespace oneapi | ||
| } // namespace ext | ||
| } // namespace sycl | ||
| } // __SYCL_INLINE_NAMESPACE(cl) | 
      
      Oops, something went wrong.
        
    
  
  Add this suggestion to a batch that can be applied as a single commit.
  This suggestion is invalid because no changes were made to the code.
  Suggestions cannot be applied while the pull request is closed.
  Suggestions cannot be applied while viewing a subset of changes.
  Only one suggestion per line can be applied in a batch.
  Add this suggestion to a batch that can be applied as a single commit.
  Applying suggestions on deleted lines is not supported.
  You must change the existing code in this line in order to create a valid suggestion.
  Outdated suggestions cannot be applied.
  This suggestion has been applied or marked resolved.
  Suggestions cannot be applied from pending reviews.
  Suggestions cannot be applied on multi-line comments.
  Suggestions cannot be applied while the pull request is queued to merge.
  Suggestion cannot be applied right now. Please check back later.
  
    
  
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an idea for an alternative approach here that doesn't involve adding a new PI entry-point, we use
piQueueFlushinstead. As it already exists in PI with the semantics of starting execution of work lazily scheduled, assuming it works the same as clFlush. I'd then expect a flush to happen on event wait as well as queue wait.piQueueFlushis also more generic than just for kernel execution commands. So if in the future we wanted more than kernel launch commands to be lazily executed, we wouldn't need to add another entry-point. e.g. could lazily enqueuepiEnqueueMemBufferCopycommands and flush them. Rather than having to add apiMemBufferCopyentry-point to match thepiEnqueueMemBufferCopylike we've done here withpiEnqueueKernelLaunch.