From 35ae798d4df0957cc0b39cb9b289a3f10a677031 Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Sat, 5 Apr 2025 00:14:56 +0000 Subject: [PATCH 1/9] document static cudart requirement --- cuda_bindings/docs/source/install.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/cuda_bindings/docs/source/install.md b/cuda_bindings/docs/source/install.md index 9c5570621..e15fa8a06 100644 --- a/cuda_bindings/docs/source/install.md +++ b/cuda_bindings/docs/source/install.md @@ -44,13 +44,17 @@ $ conda install -c conda-forge cuda-python ### Requirements * CUDA Toolkit headers[^1] +* static CUDA runtime[^2] [^1]: User projects that `cimport` CUDA symbols in Cython must also use CUDA Toolkit (CTK) types as provided by the `cuda.bindings` major.minor version. This results in CTK headers becoming a transitive dependency of downstream projects through CUDA Python. +[^2]: The static CUDA runtime (`libcudart_static.a` on Linux, `cudart_static.lib` on Windows) is part of CUDA Toolkit. If CUDA is installed from conda, it is contained in the `cuda-cudart-static` package. + Source builds require that the provided CUDA headers are of the same major.minor version as the `cuda.bindings` you're trying to build. Despite this requirement, note that the minor version compatibility is still maintained. Use the `CUDA_HOME` (or `CUDA_PATH`) environment variable to specify the location of your headers. For example, if your headers are located in `/usr/local/cuda/include`, then you should set `CUDA_HOME` with: ```console $ export CUDA_HOME=/usr/local/cuda +$ export LIBRARY_PATH=$CUDA_HOME/lib64:$LIBRARY_PATH ``` See [Environment Variables](environment_variables.md) for a description of other build-time environment variables. 
From 3cad06beea4e6e503776f98ca2245eb29ea70102 Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Sat, 5 Apr 2025 00:27:06 +0000 Subject: [PATCH 2/9] fix cuda.coop/par docs + mention numba.cuda --- README.md | 14 +++++--------- cuda_python/docs/source/conf.py | 1 + cuda_python/docs/source/index.rst | 6 ++++-- 3 files changed, 10 insertions(+), 11 deletions(-) diff --git a/README.md b/README.md index 3639857f2..78e561837 100644 --- a/README.md +++ b/README.md @@ -4,12 +4,13 @@ CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It c * [cuda.core](https://nvidia.github.io/cuda-python/cuda-core/latest): Pythonic access to CUDA Runtime and other core functionalities * [cuda.bindings](https://nvidia.github.io/cuda-python/cuda-bindings/latest): Low-level Python bindings to CUDA C APIs -* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): A Python package for easy access to highly efficient and customizable parallel algorithms, like `sort`, `scan`, `reduce`, `transform`, etc. -* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): A Python package providing CUB's reusable block-wide and warp-wide primitives for use within Numba CUDA kernels +* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): A Python package providing CUB's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels +* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): A Python package for easy access to highly efficient and customizable parallel algorithms, like `sort`, `scan`, `reduce`, `transform`, etc, that are callable on the *host*. +* [numba.cuda](https://nvidia.github.io/numba-cuda/): Numba's CUDA target for writing CUDA SIMT kernels in Python. For access to NVIDIA CPU & GPU Math Libraries, please refer to [nvmath-python](https://docs.nvidia.com/cuda/nvmath-python/latest). -CUDA Python is currently undergoing an overhaul to improve existing and bring up new components. 
All of the previously available functionalities from the cuda-python package will continue to be available, please refer to the [cuda.bindings](https://nvidia.github.io/cuda-python/cuda-bindings/latest) documentation for installation guide and further detail. +CUDA Python is currently undergoing an overhaul to improve existing and bring up new components. All of the previously available functionalities from the `cuda-python` package will continue to be available, please refer to the [cuda.bindings](https://nvidia.github.io/cuda-python/cuda-bindings/latest) documentation for installation guide and further detail. ## cuda-python as a metapackage @@ -37,9 +38,4 @@ The list of available interfaces are: * CUDA Runtime * NVRTC * nvJitLink - -## Supported Python Versions - -All `cuda-python` subpackages follows CPython [End-Of-Life](https://devguide.python.org/versions/) schedule for supported Python version guarantee. - -Before dropping support there will be an issue raised as a notice. +* NVVM diff --git a/cuda_python/docs/source/conf.py b/cuda_python/docs/source/conf.py index 8b2d757c4..fe6e934dc 100644 --- a/cuda_python/docs/source/conf.py +++ b/cuda_python/docs/source/conf.py @@ -95,4 +95,5 @@ .. _cuda.bindings: {CUDA_PYTHON_DOMAIN}/cuda-bindings/latest .. _cuda.cooperative: https://nvidia.github.io/cccl/cuda_cooperative/ .. _cuda.parallel: https://nvidia.github.io/cccl/cuda_parallel/ +.. 
_numba.cuda: https://nvidia.github.io/numba-cuda/ """ diff --git a/cuda_python/docs/source/index.rst b/cuda_python/docs/source/index.rst index 78b81a18d..beb2f3477 100644 --- a/cuda_python/docs/source/index.rst +++ b/cuda_python/docs/source/index.rst @@ -6,8 +6,9 @@ multiple components: - `cuda.core`_: Pythonic access to CUDA runtime and other core functionalities - `cuda.bindings`_: Low-level Python bindings to CUDA C APIs -- `cuda.cooperative`_: A Python package for easy access to highly efficient and customizable parallel algorithms, like `sort`, `scan`, `reduce`, `transform`, etc. -- `cuda.parallel`_: A Python package providing CUB's reusable block-wide and warp-wide primitives for use within Numba CUDA kernels +- `cuda.cooperative`_: A Python package providing CUB's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels +- `cuda.parallel`_: A Python package for easy access to highly efficient and customizable parallel algorithms, like ``sort``, ``scan``, ``reduce``, ``transform``, etc, that are callable on the *host* +- `numba.cuda`_: Numba's CUDA target for writing CUDA SIMT kernels in Python For access to NVIDIA CPU & GPU Math Libraries, please refer to `nvmath-python`_. 
@@ -30,5 +31,6 @@ be available, please refer to the `cuda.bindings`_ documentation for installatio cuda.bindings cuda.cooperative cuda.parallel + numba.cuda conduct.md contribute.md From f010f9bff47a39cb9bd484c4f3494743391d9dbd Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Sat, 5 Apr 2025 00:47:09 +0000 Subject: [PATCH 3/9] add missing release note entries --- .../docs/source/release/11.8.7-notes.rst | 6 ++++++ .../docs/source/release/12.X.Y-notes.rst | 16 +++++++++++++++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/cuda_bindings/docs/source/release/11.8.7-notes.rst b/cuda_bindings/docs/source/release/11.8.7-notes.rst index 8f1257d0b..e951db91c 100644 --- a/cuda_bindings/docs/source/release/11.8.7-notes.rst +++ b/cuda_bindings/docs/source/release/11.8.7-notes.rst @@ -9,3 +9,9 @@ Highlights * The ``cuda.bindings.nvvm`` Python module was added, wrapping the `libNVVM C API `_. + + +Bug fixes +--------- + +* Fix segfault when converting char* NULL to bytes diff --git a/cuda_bindings/docs/source/release/12.X.Y-notes.rst b/cuda_bindings/docs/source/release/12.X.Y-notes.rst index f3a29d462..d6b903e2e 100644 --- a/cuda_bindings/docs/source/release/12.X.Y-notes.rst +++ b/cuda_bindings/docs/source/release/12.X.Y-notes.rst @@ -11,5 +11,19 @@ Highlights `libNVVM C API `_. 
* Source build error checking added for missing required headers * Statically link CUDA Runtime instead of reimplementing it -* Fix performance hint warnings raised by Cython 3 * Move stream callback wrappers to the Python layer +* Return code construction is made faster + +Bug fixes +--------- + +* Fix segfault when converting char* NULL to bytes + + +Miscellaneous +------------- + +* Benchmark suite is updated +* Improvements in the introductory code samples +* Fix performance hint warnings raised by Cython 3 +* Improvements in the Overview page From 7934aa815a3829f5bbd0033a1ccb2c04aec2f9c8 Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Sat, 5 Apr 2025 00:52:35 +0000 Subject: [PATCH 4/9] improve cuda.core/bindings installation guides --- cuda_bindings/docs/source/install.md | 2 ++ cuda_core/docs/source/install.md | 6 +++--- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/cuda_bindings/docs/source/install.md b/cuda_bindings/docs/source/install.md index e15fa8a06..ecd00b571 100644 --- a/cuda_bindings/docs/source/install.md +++ b/cuda_bindings/docs/source/install.md @@ -4,8 +4,10 @@ `cuda.bindings` supports the same platforms as CUDA. Runtime dependencies are: +* Linux (x86-64, arm64) and Windows (x86-64) * Driver: Linux (450.80.02 or later) Windows (456.38 or later) * CUDA Toolkit 12.x +* Python 3.9 - 3.13 ```{note} Only the NVRTC and nvJitLink redistributable components are required from the CUDA Toolkit, which can be obtained via PyPI, Conda, or local installers (as described in the CUDA Toolkit [Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) and [Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) Installation Guides). 
diff --git a/cuda_core/docs/source/install.md b/cuda_core/docs/source/install.md index 6f59324c9..46745b721 100644 --- a/cuda_core/docs/source/install.md +++ b/cuda_core/docs/source/install.md @@ -12,7 +12,7 @@ dependencies are as follows: [^1]: Including `cuda-python`. -`cuda.core` supports Python 3.9 - 3.12, on Linux (x86-64, arm64) and Windows (x86-64). +`cuda.core` supports Python 3.9 - 3.13, on Linux (x86-64, arm64) and Windows (x86-64). ## Installing from PyPI @@ -22,8 +22,8 @@ $ pip install cuda-core[cu12] ``` and likewise use `[cu11]` for CUDA 11. -Note that using `cuda.core` with NVRTC or nvJitLink installed from PyPI via `pip install` is currently -not supported. This will be fixed in a future release. +Note that using `cuda.core` with NVRTC or nvJitLink installed from PyPI via `pip install` requires +`cuda.bindings` 12.8.0+ or 11.8.6+. ## Installing from Conda (conda-forge) From 277c23d6e1d750e9b375ad960207ccc89a3752ce Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Sun, 6 Apr 2025 19:20:59 +0000 Subject: [PATCH 5/9] consolidate installation guides to single source --- cuda_bindings/DESCRIPTION.rst | 27 ++++----------------------- cuda_bindings/README.md | 19 ++----------------- cuda_bindings/docs/source/install.md | 6 +++--- 3 files changed, 9 insertions(+), 43 deletions(-) diff --git a/cuda_bindings/DESCRIPTION.rst b/cuda_bindings/DESCRIPTION.rst index 5dd065e4d..8a925f2dd 100644 --- a/cuda_bindings/DESCRIPTION.rst +++ b/cuda_bindings/DESCRIPTION.rst @@ -1,26 +1,7 @@ -******************************************************* +**************************************** cuda.bindings: Low-level CUDA interfaces -******************************************************* +**************************************** -`cuda.bindings` is a standard set of low-level interfaces, providing full coverage of and access to the CUDA host APIs from Python. Checkout the `Overview `_ for the workflow and performance results. 
+`cuda.bindings` is a standard set of low-level interfaces, providing full coverage of and 1:1 access to the CUDA host APIs from Python. Check out the `Overview `_ for the workflow and performance results. -Installation -============ - -`cuda.bindings` can be installed from: - -* PyPI -* Conda (conda-forge/nvidia channels) -* Source builds - -Differences between these options are described in `Installation `_ documentation. Each package guarantees minor version compatibility. - -Runtime Dependencies -==================== - -`cuda.bindings` is supported on all the same platforms as CUDA. Specific dependencies are as follows: - -* Driver: Linux (450.80.02 or later) Windows (456.38 or later) -* CUDA Toolkit 12.x - -Only the NVRTC and nvJitLink redistributable components are required from the CUDA Toolkit, which can be obtained via PyPI, Conda, or local installers (as described in the CUDA Toolkit `Windows `_ and `Linux `_ Installation Guides). +For installation instructions, please refer to the `Installation `_ page. diff --git a/cuda_bindings/README.md b/cuda_bindings/README.md index b468fb0ec..47233f612 100644 --- a/cuda_bindings/README.md +++ b/cuda_bindings/README.md @@ -1,27 +1,12 @@ # `cuda.bindings`: Low-level CUDA interfaces -`cuda.bindings` is a standard set of low-level interfaces, providing full coverage of and access to the CUDA host APIs from Python. Checkout the [Overview](https://nvidia.github.io/cuda-python/cuda-bindings/latest/overview.html) for the workflow and performance results. +`cuda.bindings` is a standard set of low-level interfaces, providing full coverage of and access to the CUDA host APIs from Python. Check out the [Overview page](https://nvidia.github.io/cuda-python/cuda-bindings/latest/overview.html) for the workflow and performance results. `cuda.bindings` is a subpackage of `cuda-python`. 
## Installing -`cuda.bindings` can be installed from: - -* PyPI -* Conda (conda-forge/nvidia channels) -* Source builds - -Differences between these options are described in [Installation](https://nvidia.github.io/cuda-python/cuda-bindings/latest/install.html) documentation. Each package guarantees minor version compatibility. - -## Runtime Dependencies - -`cuda.bindings` is supported on all the same platforms as CUDA. Specific dependencies are as follows: - -* Driver: Linux (450.80.02 or later) Windows (456.38 or later) -* CUDA Toolkit 12.x - -Only the NVRTC and nvJitLink redistributable components are required from the CUDA Toolkit, which can be obtained via PyPI, Conda, or local installers (as described in the CUDA Toolkit [Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) and [Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) Installation Guides). +Please refer to the [Installation page](https://nvidia.github.io/cuda-python/cuda-bindings/latest/install.html) for instructions and required/optional dependencies. ## Developing diff --git a/cuda_bindings/docs/source/install.md b/cuda_bindings/docs/source/install.md index ecd00b571..1bc64bfb2 100644 --- a/cuda_bindings/docs/source/install.md +++ b/cuda_bindings/docs/source/install.md @@ -5,12 +5,12 @@ `cuda.bindings` supports the same platforms as CUDA. 
Runtime dependencies are: * Linux (x86-64, arm64) and Windows (x86-64) -* Driver: Linux (450.80.02 or later) Windows (456.38 or later) -* CUDA Toolkit 12.x * Python 3.9 - 3.13 +* Driver: Linux (450.80.02 or later) Windows (456.38 or later) +* Optionally, NVRTC, nvJitLink, and NVVM from CUDA Toolkit 12.x ```{note} -Only the NVRTC and nvJitLink redistributable components are required from the CUDA Toolkit, which can be obtained via PyPI, Conda, or local installers (as described in the CUDA Toolkit [Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) and [Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) Installation Guides). +The optional CUDA Toolkit components can be installed via PyPI, Conda, OS-specific package managers, or local installers (as described in the CUDA Toolkit [Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) and [Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) Installation Guides). ``` Starting from v12.8.0, `cuda-python` becomes a meta package which currently depends only on `cuda-bindings`; in the future more sub-packages will be added to `cuda-python`. In the instructions below, we still use `cuda-python` as example to serve existing users, but everything is applicable to `cuda-bindings` as well. 
From cfdfd8c81fb9767a4aba18690aed8786f05514e9 Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Sun, 6 Apr 2025 20:14:45 +0000 Subject: [PATCH 6/9] enable intersphinx to fix crossrefs; also enable copy button --- cuda_bindings/docs/source/conf.py | 24 ++++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/cuda_bindings/docs/source/conf.py b/cuda_bindings/docs/source/conf.py index bf9c08472..995cecca5 100644 --- a/cuda_bindings/docs/source/conf.py +++ b/cuda_bindings/docs/source/conf.py @@ -18,7 +18,7 @@ # -- Project information ----------------------------------------------------- project = "cuda.bindings" -copyright = "2021-2024, NVIDIA" +copyright = "2021-2025, NVIDIA" author = "NVIDIA" # The full version, including alpha/beta/rc tags @@ -30,7 +30,14 @@ # Add any Sphinx extension module names here, as strings. They can be # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. -extensions = ["sphinx.ext.autodoc", "sphinx.ext.napoleon", "myst_nb", "enum_tools.autoenum"] +extensions = [ + "sphinx.ext.autodoc", + "sphinx.ext.napoleon", + "sphinx.ext.intersphinx", + "myst_nb", + "enum_tools.autoenum", + "sphinx_copybutton", +] nb_execution_mode = "off" numfig = True @@ -43,6 +50,9 @@ # This pattern also affects html_static_path and html_extra_path. exclude_patterns = [] +# The reST default role (used for this markup: `text`) to use for all documents. +default_role = "cpp:any" + # -- Options for HTML output ------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for @@ -85,6 +95,16 @@ # so a file named "default.css" will overwrite the builtin "default.css". 
html_static_path = ["_static"] +# skip cmdline prompts +copybutton_exclude = ".linenos, .gp" + +intersphinx_mapping = { + "python": ("https://docs.python.org/3/", None), + "numpy": ("https://numpy.org/doc/stable/", None), + "nvvm": ("https://docs.nvidia.com/cuda/libnvvm-api/", None), + "nvjitlink": ("https://docs.nvidia.com/cuda/nvjitlink/", None), +} + suppress_warnings = [ # for warnings about multiple possible targets, see NVIDIA/cuda-python#152 "ref.python", From d8ed71313a7f708570402c6cb2818e1e374c48a8 Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Mon, 7 Apr 2025 01:16:02 +0000 Subject: [PATCH 7/9] misc fixes --- cuda_bindings/docs/build_docs.sh | 2 +- cuda_bindings/docs/source/overview.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/cuda_bindings/docs/build_docs.sh b/cuda_bindings/docs/build_docs.sh index 8ace0b330..ffeece9da 100755 --- a/cuda_bindings/docs/build_docs.sh +++ b/cuda_bindings/docs/build_docs.sh @@ -17,7 +17,7 @@ fi # version selector or directory structure. if [[ -z "${SPHINX_CUDA_BINDINGS_VER}" ]]; then export SPHINX_CUDA_BINDINGS_VER=$(python -c "from importlib.metadata import version; \ - ver = '.'.join(str(version('cuda-python')).split('.')[:3]); \ + ver = '.'.join(str(version('cuda-bindings')).split('.')[:3]); \ print(ver)" \ | awk -F'+' '{print $1}') fi diff --git a/cuda_bindings/docs/source/overview.md b/cuda_bindings/docs/source/overview.md index bfd027b5f..1168d926f 100644 --- a/cuda_bindings/docs/source/overview.md +++ b/cuda_bindings/docs/source/overview.md @@ -205,7 +205,7 @@ argument on either host or device. Since we already prepared each of our argumen construction of our final contiguous array is done by retrieving the `XX.ctypes.data` of each kernel argument. 
-```{code-cell} python +```python args = [a, dX, dY, dOut, n] args = np.array([arg.ctypes.data for arg in args], dtype=np.uint64) ``` From 24ec10944bf01630eddee6fce1b802e63efc3819 Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Mon, 7 Apr 2025 01:35:32 +0000 Subject: [PATCH 8/9] make cpp:any a local role to suppress sphinx warnings --- cuda_bindings/docs/source/conf.py | 3 --- cuda_bindings/docs/source/module/nvjitlink.rst | 2 ++ cuda_bindings/docs/source/module/nvvm.rst | 2 ++ 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/cuda_bindings/docs/source/conf.py b/cuda_bindings/docs/source/conf.py index 995cecca5..5ea8b4443 100644 --- a/cuda_bindings/docs/source/conf.py +++ b/cuda_bindings/docs/source/conf.py @@ -50,9 +50,6 @@ # This pattern also affects html_static_path and html_extra_path. exclude_patterns = [] -# The reST default role (used for this markup: `text`) to use for all documents. -default_role = "cpp:any" - # -- Options for HTML output ------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for diff --git a/cuda_bindings/docs/source/module/nvjitlink.rst b/cuda_bindings/docs/source/module/nvjitlink.rst index 79f5cd106..973cd5c44 100644 --- a/cuda_bindings/docs/source/module/nvjitlink.rst +++ b/cuda_bindings/docs/source/module/nvjitlink.rst @@ -1,3 +1,5 @@ +.. default-role:: cpp:any + nvjitlink ========= diff --git a/cuda_bindings/docs/source/module/nvvm.rst b/cuda_bindings/docs/source/module/nvvm.rst index 502d20ea1..33da86a64 100644 --- a/cuda_bindings/docs/source/module/nvvm.rst +++ b/cuda_bindings/docs/source/module/nvvm.rst @@ -1,3 +1,5 @@ +.. 
default-role:: cpp:any + nvvm ==== From 2759ae8f5a7913c4ee3e158c0c865bdd48bd9c6d Mon Sep 17 00:00:00 2001 From: Leo Fang Date: Mon, 7 Apr 2025 11:15:48 -0400 Subject: [PATCH 9/9] Apply suggestions from code review Co-authored-by: Keith Kraus --- README.md | 6 +++--- cuda_bindings/docs/source/install.md | 4 ++-- cuda_python/docs/source/index.rst | 6 +++--- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index 78e561837..25b3b87af 100644 --- a/README.md +++ b/README.md @@ -4,9 +4,9 @@ CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It c * [cuda.core](https://nvidia.github.io/cuda-python/cuda-core/latest): Pythonic access to CUDA Runtime and other core functionalities * [cuda.bindings](https://nvidia.github.io/cuda-python/cuda-bindings/latest): Low-level Python bindings to CUDA C APIs -* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): A Python package providing CUB's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels -* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): A Python package for easy access to highly efficient and customizable parallel algorithms, like `sort`, `scan`, `reduce`, `transform`, etc, that are callable on the *host*. -* [numba.cuda](https://nvidia.github.io/numba-cuda/): Numba's CUDA target for writing CUDA SIMT kernels in Python. 
+* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): A Python package providing CCCL's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels +* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): A Python package for easy access to CCCL's highly efficient and customizable parallel algorithms, like `sort`, `scan`, `reduce`, `transform`, etc, that are callable on the *host* +* [numba.cuda](https://nvidia.github.io/numba-cuda/): Numba's target for CUDA GPU programming by directly compiling a restricted subset of Python code into CUDA kernels and device functions following the CUDA execution model. For access to NVIDIA CPU & GPU Math Libraries, please refer to [nvmath-python](https://docs.nvidia.com/cuda/nvmath-python/latest). diff --git a/cuda_bindings/docs/source/install.md b/cuda_bindings/docs/source/install.md index 1bc64bfb2..175e304e6 100644 --- a/cuda_bindings/docs/source/install.md +++ b/cuda_bindings/docs/source/install.md @@ -46,11 +46,11 @@ $ conda install -c conda-forge cuda-python ### Requirements * CUDA Toolkit headers[^1] -* static CUDA runtime[^2] +* CUDA Runtime static library[^2] [^1]: User projects that `cimport` CUDA symbols in Cython must also use CUDA Toolkit (CTK) types as provided by the `cuda.bindings` major.minor version. This results in CTK headers becoming a transitive dependency of downstream projects through CUDA Python. -[^2]: The static CUDA runtime (`libcudart_static.a` on Linux, `cudart_static.lib` on Windows) is part of CUDA Toolkit. If CUDA is installed from conda, it is contained in the `cuda-cudart-static` package. +[^2]: The CUDA Runtime static library (`libcudart_static.a` on Linux, `cudart_static.lib` on Windows) is part of the CUDA Toolkit. If using conda packages, it is contained in the `cuda-cudart-static` package. Source builds require that the provided CUDA headers are of the same major.minor version as the `cuda.bindings` you're trying to build. 
Despite this requirement, note that the minor version compatibility is still maintained. Use the `CUDA_HOME` (or `CUDA_PATH`) environment variable to specify the location of your headers. For example, if your headers are located in `/usr/local/cuda/include`, then you should set `CUDA_HOME` with: diff --git a/cuda_python/docs/source/index.rst b/cuda_python/docs/source/index.rst index beb2f3477..9a492d1ef 100644 --- a/cuda_python/docs/source/index.rst +++ b/cuda_python/docs/source/index.rst @@ -6,9 +6,9 @@ multiple components: - `cuda.core`_: Pythonic access to CUDA runtime and other core functionalities - `cuda.bindings`_: Low-level Python bindings to CUDA C APIs -- `cuda.cooperative`_: A Python package providing CUB's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels -- `cuda.parallel`_: A Python package for easy access to highly efficient and customizable parallel algorithms, like ``sort``, ``scan``, ``reduce``, ``transform``, etc, that are callable on the *host* -- `numba.cuda`_: Numba's CUDA target for writing CUDA SIMT kernels in Python +- `cuda.cooperative`_: A Python package providing CCCL's reusable block-wide and warp-wide *device* primitives for use within Numba CUDA kernels +- `cuda.parallel`_: A Python package for easy access to CCCL's highly efficient and customizable parallel algorithms, like ``sort``, ``scan``, ``reduce``, ``transform``, etc, that are callable on the *host* +- `numba.cuda`_: Numba's target for CUDA GPU programming by directly compiling a restricted subset of Python code into CUDA kernels and device functions following the CUDA execution model. For access to NVIDIA CPU & GPU Math Libraries, please refer to `nvmath-python`_.