* Array4: __cuda_array_interface__ v2
Start implementing the `__cuda_array_interface__` for zero-copy
data exchange on NVIDIA CUDA GPUs.
* MultiFab: CuPy Test
* `MFIter`: `Finalize()` on `StopIteration`
Since `for` loops create no scope in Python, we need to trigger
finalize logic, including stream syncs, before the destructors of
`MultiFab` iterators are called.
* Add numba test
incl. 3D kernel launch
* Add pytorch
* CuPy Fuse: Avoid Extra Memset
* MultiFab Device Test: Fixes
* Update to v3
* Array4: TODO from CUDA
Implementing this caster as a new constructor is a bit tricky.
It is not currently needed, but comments mark where to do it.
Co-authored-by: Remi Lehe <[email protected]>
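The `MFIter` finalize-on-`StopIteration` change above can be sketched in plain Python. The class and method names below are illustrative stand-ins, not pyAMReX's actual API; the point is only the pattern of running cleanup when iteration is exhausted rather than waiting for the destructor:

```python
class MFIterSketch:
    """Illustrative stand-in for an MFIter-like iterator.

    Python `for` loops create no scope, so the iterator object outlives
    the loop body; cleanup (e.g. GPU stream syncs) must run when
    iteration ends, not whenever the object happens to be destroyed.
    """

    def __init__(self, n_boxes):
        self._n = n_boxes
        self._i = 0
        self.finalized = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._i >= self._n:
            # Trigger finalize logic before raising StopIteration,
            # rather than relying on __del__ being called promptly.
            self._finalize()
            raise StopIteration
        current = self._i
        self._i += 1
        return current

    def _finalize(self):
        # Placeholder for the real work (stream synchronization, etc.).
        self.finalized = True


it = MFIterSketch(3)
boxes = [b for b in it]  # exhausting the iterator runs _finalize()
```

After the loop, `it.finalized` is already `True`, even though the iterator object itself is still alive.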
// Because the user of the interface may or may not be in the same context, the most common case is to use cuPointerGetAttribute with CU_POINTER_ATTRIBUTE_DEVICE_POINTER in the CUDA driver API (or the equivalent CUDA Runtime API) to retrieve a device pointer that is usable in the currently active context.
// TODO For zero-size arrays, use 0 here.

// None or integer
// An optional stream upon which synchronization must take place at the point of consumption, either by synchronizing on the stream or enqueuing operations on the data on the given stream. Integer values in this entry are as follows:
// 0: This is disallowed as it would be ambiguous between None and the default stream, and also between the legacy and per-thread default streams. Any use case where 0 might be given should either use None, 1, or 2 instead for clarity.
// 1: The legacy default stream.
// 2: The per-thread default stream.
// Any other integer: a cudaStream_t represented as a Python integer.
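The `stream` rules quoted above can be sketched as a small consumer-side helper. This is a minimal illustration of the v3 interface dict and the stream-entry semantics, not pyAMReX code; the dict uses a zero-size array so the sketch runs without a GPU (per the TODO above, the data pointer is 0 for zero-size arrays):

```python
def interpret_stream_entry(stream):
    """Describe the 'stream' entry of a __cuda_array_interface__ v3 dict.

    Sketch only: a real consumer would synchronize on the named stream or
    enqueue its operations on it, per the rules quoted above.
    """
    if stream is None:
        return "no synchronization required"
    if stream == 0:
        # Ambiguous between None and the default stream, and between the
        # legacy and per-thread default streams.
        raise ValueError("stream=0 is disallowed; use None, 1, or 2")
    if stream == 1:
        return "legacy default stream"
    if stream == 2:
        return "per-thread default stream"
    return f"cudaStream_t represented as integer {stream}"


# Hypothetical v3 interface dict for a zero-size float64 array.
cai = {
    "shape": (0,),       # zero-size array
    "typestr": "<f8",    # little-endian float64
    "data": (0, False),  # (pointer, read_only); pointer 0 for zero-size
    "version": 3,
    "stream": 2,         # per-thread default stream
}
desc = interpret_stream_entry(cai["stream"])
```

The helper rejects `stream=0` with an error, mirroring the spec's "disallowed as ambiguous" rule.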