Commit 0c5cce9

b1shtream authored and vgvassilev committed
Fix typo and add research papers to Cppyy-CUDA introduction blogpost
1 parent 48b60fb commit 0c5cce9

1 file changed

+8
-8
lines changed

_posts/2024-05-30-enable-cuda-compilation-cppyy-numba-generated-ir.md

Lines changed: 8 additions & 8 deletions
@@ -21,11 +21,11 @@ I got introduced to this project while researching on my personal research proje

 ### Introduction to Cppyy and the problem statement

-Cppyy is an automatic python-C++ runtime binding generator that helps to call C++ code from Python and vice-versa. This enables interoperability between different language ecosystems, avoids the cross-language overhead, and promotes heterogeneous computing. The initial support for Numba, a Python JIT Compiler has been added which compiles looped code containing C++ objects/methods/functions defined via Cppyy into fast machine code. This proposed project seeks to leverage Cppyy's integration with Numba, a high-performance Python compiler, to enable the compilation of CUDA C++ code defined via Cppyy into efficient machine code.
+Cppyy is an automatic python-C++ runtime binding generator that helps to call C++ code from Python and vice-versa. This enables interoperability between two different language ecosystems, avoids the cross-language overhead, and promotes heterogeneous computing. The initial support for Numba, a Python JIT Compiler has been added which compiles looped code containing C++ objects/methods/functions defined via Cppyy into fast machine code. This proposed project seeks to leverage Cppyy's integration with Numba, a high-performance Python compiler, to enable the compilation of CUDA C++ code defined via Cppyy into efficient machine code.

 ### Importance of this project

-As we know, heterogeneous computing is the future. The scientific community is heavily relying on GPGPU(General-Purpose Graphics Processing Unit)computations, that incorporate CPUs as well as GPUs for running workloads based on their requirements. This architecture of GPGPUs generates a need for scientists to understand the low-level graphics APIs like CUDA(Compute Unified Device Architecture) which comes with a whole new learning curve, instead, we can use Python language which is more familiar to the scientific ecosystem. Cppyy can help provide efficient Python-CUDA C++ bindings during runtime. This enables scientists to leverage GPU acceleration in a much more user-friendly language with rich ecosystems like Python without compromising on performance. Based on research, python can be slow as compared to other performant systems programming languages like C++ so we will use Numba, a high-performance Python JIT compiler that will produce fast machine code out of our Python code.
+As we know, heterogeneous computing is the future. The scientific community heavily rely on GPGPU(General-Purpose Graphics Processing Unit)computations, that incorporate CPUs as well as GPUs for running workloads based on their requirements. This architecture of GPGPUs generates a need for scientists to understand the low-level graphics APIs like CUDA(Compute Unified Device Architecture) which comes with a whole new learning curve, instead, we can use Python language which is more familiar to the scientific ecosystem. Cppyy can help provide efficient Python-CUDA C++ bindings during runtime. This enables scientists to leverage GPU acceleration in a much more user-friendly language, Python, with a rich ecosystem without compromising on performance. Based on research, Python can be slow as compared to other performant systems programming languages like C++ so we will use Numba, a high-performance Python JIT compiler that will produce fast machine code out of Python code.

 ### Implementation Approach and Plans

@@ -34,14 +34,12 @@ Milestones of this project include:
 By separating the CUDA and C++ code execution paths, cppyy can provide a more stable and efficient environment for integrating CUDA functionality into Python.

 2. **Designing and developing CUDA compilation pipeline**: At present, the CUDA compilation is supported by adding CUDA headers to PCH(Pre-compiled headers) during runtime but this provides control to Cling, which is an interactive C++ interpreter. We want to take control from Cling and provide it to Numba using numba decorators while it invokes GPU kernels from Cppyy. Numba uses the proxies to obtain function pointers and then runs the LLVM compilation passes using `llvmlite`. That's why the scope of the project is to utilize numba so we don’t have to deal with Cling. This can include adding:
-- Support of helpers in `numba_ext.py` to simplify the process of launching CUDA kernels directly from Python.
-- Support of CUDA-specific data types in `LLVM IR`.
-
-The research is still ongoing for this part of the project.
+- Support of helpers in `numba_ext.py` to simplify the process of launching CUDA kernels directly from Python.
+- Support of CUDA-specific data types in `LLVM IR`. [The research is still ongoing for this part of the project.]

 3. **Testing and Documentation support**: Prepare comprehensive tests to ensure functionality and robustness. Create detailed documentation including debugging guides for users and developers.

-4. **Future scope**: To provide further optimization techniques for extracting more performance out of GPUs
+4. **Future scope**: To provide further optimization techniques for extracting more performance out of GPUs.

 Upon successful completion, a possible proof-of-concept can be expected in the below code snippet:

@@ -66,7 +64,9 @@ This would allow Python users to utilize CUDA for parallel computing on GPUs whi

 ### Conclusion

-The impact of this project extends far beyond Cppyy itself, as it empowers the scientific community by providing Python users with direct access to the performance and capabilities of C++ libraries. The CUDA support in the Python ecosystem through Cppyy and Numba can help accelerate the research and development in Scientific Computing domains like Data analysis(ROOT), Machine Learning, and computational sciences like simulating genetic code, protein structures, etc that rely on both languages.
+The impact of this project extends far beyond Cppyy itself, as it empowers the scientific community by providing Python users with direct access to the performance and capabilities of C++ libraries. The CUDA support in the Python ecosystem through Cppyy and Numba can help accelerate the research and development in Scientific Computing domains like Data analysis(ROOT), Machine Learning, and computational sciences like simulating genetic code, protein structures, etc that rely on both languages. The following papers shows the importance of CUDA and GPU acceleration in scientific community:
+- Simulations use GPUs to run the world's largest simulations on the world's largest supercomputer: [Link](https://escholarship.org/content/qt5q63r9ph/qt5q63r9ph_noSplash_29f23cdb21b554ab0457d33f14e9d6e0.pdf)
+- This enables to perform GPU-accelerated modeling and seamless GPU-accelerated, zero-copy extensions of the fast codes from Python. Useful for rapid prototyping of new physics modules, development of in situ analysis as well as coupling multiple codes and codes with ML frameworks and the data science ecosystem: [Link]( https://arxiv.org/abs/2402.17248)

 ### Related Links
