Memory refactor #1205
base: main
feda70e to 52164c0
/ok to test f13a44e
f13a44e to d3dc347
d3dc347 to 44f7587
/ok to test 7c97d22
@Andy-Jost To frame the code review, can you fill in the PR description with more details about what the goals of the refactor were?
Andy-Jost left a comment
This one is a beast. I tried to leave some helpful comments.
This file assembles the _memory package by combining the public elements of each submodule. This should match the public interface of the old _memory.pyx module.
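As a rough illustration of this kind of package assembly, the sketch below builds a throwaway package on disk whose `__init__.py` re-exports only what each submodule lists in `__all__`. The package name `demo_memory_pkg`, the stub classes, and the exact form of the `__init__.py` are assumptions made for the demo; the real file in the PR may be written differently.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for the real submodules; only the export mechanics matter.
FILES = {
    "__init__.py": (
        "from ._buffer import *\n"
        "from ._dmr import *\n"
        "from . import _buffer, _dmr\n"
        "__all__ = [*_buffer.__all__, *_dmr.__all__]\n"
    ),
    "_buffer.py": (
        "__all__ = ['Buffer', 'MemoryResource']\n"
        "class MemoryResource: pass\n"
        "class Buffer: pass\n"
        "class ScratchState: pass  # not in __all__, so not re-exported\n"
    ),
    "_dmr.py": (
        "from ._buffer import MemoryResource\n"
        "__all__ = ['DeviceMemoryResource']\n"
        "class DeviceMemoryResource(MemoryResource): pass\n"
    ),
}

# Write the demo package into a temporary directory and make it importable.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_memory_pkg"
pkg_dir.mkdir()
for name, src in FILES.items():
    (pkg_dir / name).write_text(src)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

pkg = importlib.import_module("demo_memory_pkg")
exported = sorted(pkg.__all__)
print(exported)
```

Keeping the export list in each submodule means the package-level interface can be compared against the old single-module interface by inspecting one list.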
This file contains Cython declarations related to Buffer. I also put the declaration of MemoryResource here because I couldn't find a better place.
Classes _cyBuffer and _cyMemoryResource were eliminated.
This file contains the implementation of Buffer.
def _clear(self):
    self._ptr = 0
    self._size = 0
    self._mr = None
    self._ptr_obj = None
    self._alloc_stream = None
This is used by VMM. I did not try to change the logic of that code.
    stream: Stream = None
) -> Buffer:
    """Import a buffer that was exported from another process."""
    return _ipc.Buffer_from_ipc_descriptor(cls, mr, ipc_buffer, stream)
Implementation of IPC functions has been moved to the _ipc module.
raise_if_driver_error(res2)

# Invalidate the old buffer so its destructor won't try to free again
buf._clear()
This was the only change to the VMM code. Cf. _memory.pyx:1432-5
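The reason for calling `_clear()` on the old buffer can be shown with a toy model. The sketch below is not the VMM code: `RecordingAllocator`, `resize_in_place`, and the simplified `Buffer` are hypothetical, and only the ownership bookkeeping follows the excerpt above.

```python
class RecordingAllocator:
    """Hypothetical allocator that records frees and rejects a double free."""

    def __init__(self):
        self.freed = []

    def deallocate(self, ptr, size):
        if ptr in self.freed:
            raise RuntimeError(f"double free of {ptr:#x}")
        self.freed.append(ptr)


class Buffer:
    """Toy buffer; only the ownership bookkeeping mirrors the excerpt."""

    def __init__(self, ptr, size, mr):
        self._ptr = ptr
        self._size = size
        self._mr = mr

    def _clear(self):
        # Drop ownership without freeing anything.
        self._ptr = 0
        self._size = 0
        self._mr = None

    def close(self):
        # Free only if this object still owns an allocation.
        if self._ptr and self._mr is not None:
            self._mr.deallocate(self._ptr, self._size)
        self._clear()


def resize_in_place(buf, new_size):
    """Hypothetical resize: the new Buffer takes over the same address."""
    new_buf = Buffer(buf._ptr, new_size, buf._mr)
    buf._clear()  # the old object must no longer free the shared address
    return new_buf


alloc = RecordingAllocator()
old = Buffer(0x1000, 64, alloc)
new = resize_in_place(old, 128)
old.close()  # no-op: ownership has moved to `new`
new.close()  # the single real free
```

Without the `buf._clear()` call, both `close()` calls would reach the allocator with the same address and the second would raise.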
    )
)
if attr == 1:
    from cuda.core.experimental._memory import DeviceMemoryResource
I believe this import should be delayed (like this) to avoid a circular dependency.
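A minimal, self-contained sketch of why a function-level import sidesteps the cycle. The module names `demo_device` and `demo_memory` and their contents are hypothetical stand-ins, not the real cuda.core modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# demo_device needs demo_memory only when the property is first accessed.
DEVICE_SRC = '''
class Device:
    def __init__(self, dev_id):
        self.dev_id = dev_id
        self._mr = None

    @property
    def memory_resource(self):
        if self._mr is None:
            # Deferred import: runs on first access, when both modules are loaded.
            from demo_memory import DeviceMemoryResource
            self._mr = DeviceMemoryResource(self.dev_id)
        return self._mr
'''

# demo_memory imports demo_device at module load, closing the cycle.
MEMORY_SRC = '''
from demo_device import Device  # top-level import in the opposite direction

class DeviceMemoryResource:
    def __init__(self, dev_id):
        self.dev_id = dev_id

def default_device():
    return Device(0)
'''

tmp = tempfile.mkdtemp()
Path(tmp, "demo_device.py").write_text(DEVICE_SRC)
Path(tmp, "demo_memory.py").write_text(MEMORY_SRC)
sys.path.insert(0, tmp)
importlib.invalidate_caches()

demo_device = importlib.import_module("demo_device")
mr = demo_device.Device(0).memory_resource
print(type(mr).__name__, mr.dev_id)
```

If `demo_device` imported `DeviceMemoryResource` at module level instead, loading it first would start loading `demo_memory`, which would then try to pull `Device` from a module that has not finished executing, and the import would fail.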
    from cuda.core.experimental._memory import DeviceMemoryResource
    device._mr = DeviceMemoryResource(dev_id)
else:
    from cuda.core.experimental._memory import _SynchronousMemoryResource
ditto

@memory_resource.setter
def memory_resource(self, mr):
    from cuda.core.experimental._memory import MemoryResource
ditto
# Receive the memory resource.
handle = mp.reduction.recv_handle(conn)
mr = DeviceMemoryResource.from_allocation_handle(device, handle)
os.close(handle)
A small functional change.
Fixed in 0d5f08b
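The "caller closes the fd" convention from the excerpt can be sketched with an ordinary pipe. Everything here is an assumption for illustration: `import_from_handle` is a hypothetical importer that uses a descriptor without taking ownership of it, and no CUDA call is involved.

```python
import os


def import_from_handle(fd: int) -> bytes:
    """Hypothetical importer: uses the descriptor but never closes it."""
    return os.read(fd, 1024)


def receive(fd: int) -> bytes:
    """The caller owns `fd` and closes it even if the import fails."""
    try:
        return import_from_handle(fd)
    finally:
        os.close(fd)


def fd_is_open(fd: int) -> bool:
    """Report whether `fd` still refers to an open descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


read_fd, write_fd = os.pipe()
os.write(write_fd, b"pool-handle")
os.close(write_fd)

payload = receive(read_fd)
still_open = fd_is_open(read_fd)
```

Wrapping the import in `try`/`finally` keeps the descriptor from leaking when the import raises, which a bare `os.close(handle)` after the call would not guarantee.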
/ok to test 0fac800
…a file descriptor (caller closes the fd).
/ok to test 0d5f08b
/ok to test 567ea2c
def _clear(self):
    self._ptr = 0
    self._size = 0
    self._mr = None
Nit: Consider renaming _mr -> _memory_resource or mem_resource.
    stream: Stream | None = None
):
    cdef Buffer self = Buffer.__new__(cls)
    self._ptr = <intptr_t>(int(ptr))
Shouldn't this be a uintptr_t or a uint64_t?
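To make the signed-versus-unsigned concern concrete, the sketch below uses `ctypes` fixed-width integers as stand-ins for 64-bit `intptr_t` and `uintptr_t`. The address is a made-up value with the top bit set, and `ctypes` wraps silently, which may differ from how the Cython cast in the PR behaves for such a value.

```python
import ctypes

# Hypothetical address with the most significant bit set.
addr = 0xFFFF_8000_0000_1000

# ctypes performs no overflow checking, so the signed view wraps to a negative number.
as_signed = ctypes.c_int64(addr).value
as_unsigned = ctypes.c_uint64(addr).value

# Range check that a signed 64-bit type would need to pass.
fits_signed = -(2**63) <= addr < 2**63

print(hex(addr), as_signed, as_unsigned, fits_signed)
```

An unsigned pointer-sized type round-trips every 64-bit address, while a signed one only covers the lower half of that range.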
/ok to test cf4dc9d
/ok to test 19e4b8f
Major refactoring of the memory package.

Overview
This PR refactors the _memory.pyx module into a dedicated package (_memory/) to address its growing size and complexity, which were hindering further development. The primary goals are to physically separate the code into more manageable submodules, simplify the internal logic, and enhance the overall structure, including the addition of .pxd headers for better Cython integration.

Major Changes
- … _memory.pyx into submodules, the major ones being the following:
  - _buffer.*
  - _dmr.*
  - _ipc.*
  - _vmm.*
- … .pxd) for public definitions to improve modularity and type safety.
- … DeviceMemoryResource to isolate IPC-related code, reducing coupling.
- … IPCData class to encapsulate relevant data members and eliminating a redundant uuid field.

Minor Improvements
- … __all__ lists to modules for explicit control over exports.
- … _handle instead of _mempool_handle).