There were 3 separate threads about this issue on the Open MPI mailing list. The short version:

It looks like this app (CP2K -- https://www.cp2k.org/) is experiencing a large slowdown related to MPI_ALLOC_MEM / MPI_FREE_MEM; users report that profiling shows it spending 70% of its time in ALLOC_MEM, for example.

It's not immediately clear whether MPI_ALLOC_MEM is being called too often (e.g., for buffers that don't really need to be registered), or whether eagerly registering/deregistering every buffer at alloc/dealloc time is simply much more expensive than Open MPI's usual lazy model of registration/deregistration.
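For context, here is a minimal sketch (not taken from CP2K) of the allocation pattern at issue: each MPI_Alloc_mem / MPI_Free_mem pair may eagerly register and then deregister the buffer, whereas a plain malloc/free would leave registration to Open MPI's lazy registration cache.

```c
#include <mpi.h>

/* Minimal sketch (not from CP2K) of the allocation pattern at issue:
 * each MPI_Alloc_mem/MPI_Free_mem pair may eagerly register and then
 * deregister the buffer, while a plain malloc/free would leave
 * registration to Open MPI's lazy registration cache. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    for (int i = 0; i < 100000; ++i) {                /* many short-lived buffers */
        void *buf = NULL;
        MPI_Alloc_mem(1 << 20, MPI_INFO_NULL, &buf); /* may register 1 MiB eagerly */
        /* ... use buf as a communication buffer ... */
        MPI_Free_mem(buf);                           /* may deregister immediately */
    }

    MPI_Finalize();
    return 0;
}
```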
Per the discussion in those threads, and further discussion with @hjelmn, it sounds like we should:
1. Add an MCA param to disable de/registration during MPI_ALLOC_MEM / MPI_FREE_MEM.
2. Default this MCA param to "disable" when memory hooks are enabled (i.e., the lazy mechanisms work quite well), and to "enable" when memory hooks are disabled (i.e., we want to let users effectively control their own registration/deregistration). A rough sketch of this default logic is below.
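Purely as illustration, a minimal sketch of that default logic, assuming a boolean-style MCA param; the names below (`memory_hooks_enabled`, `mca_alloc_mem_register`, `alloc_mem_should_register`) are hypothetical placeholders, not actual Open MPI symbols or MCA parameter names:

```c
#include <stdbool.h>

/* Hypothetical sketch of the proposed behavior; none of these names are
 * actual Open MPI symbols or MCA parameter names. */
static bool memory_hooks_enabled;        /* true when the lazy registration hooks are active */
static int  mca_alloc_mem_register = -1; /* hypothetical MCA param: -1 = unset, 0 = disable, 1 = enable */

/* Decide whether MPI_Alloc_mem/MPI_Free_mem should eagerly register/deregister. */
static bool alloc_mem_should_register(void)
{
    if (mca_alloc_mem_register >= 0) {
        /* The user set the MCA param explicitly: honor it. */
        return mca_alloc_mem_register != 0;
    }
    /* Default: skip eager registration when memory hooks (the lazy model)
     * are available; register eagerly when they are not. */
    return !memory_hooks_enabled;
}
```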
Opening this issue to track the progress.
@loveshack @hiliev @JingchaoZhang @ggouaillardet @hjelmn @bosilca Feel free to tag others if they are interested.