RFC: ompi_mpi_params.c: set mpi_add_procs_cutoff default to 0 #1340

Merged
12 changes: 7 additions & 5 deletions ompi/runtime/ompi_mpi_params.c
@@ -10,7 +10,7 @@
  * University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  * All rights reserved.
- * Copyright (c) 2006-2015 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2006-2016 Cisco Systems, Inc. All rights reserved.
  * Copyright (c) 2007-2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2013 NVIDIA Corporation. All rights reserved.
@@ -64,7 +64,9 @@ int ompi_mpi_event_tick_rate = -1;
 char *ompi_mpi_show_mca_params_string = NULL;
 bool ompi_mpi_have_sparse_group_storage = !!(OMPI_GROUP_SPARSE);
 bool ompi_mpi_preconnect_mpi = false;
-uint32_t ompi_add_procs_cutoff = 1024;
+
+#define OMPI_ADD_PROCS_CUTOFF_DEFAULT 0
+uint32_t ompi_add_procs_cutoff = OMPI_ADD_PROCS_CUTOFF_DEFAULT;
 bool ompi_mpi_dynamics_enabled = true;
 
 static bool show_default_mca_params = false;
@@ -263,12 +265,12 @@ int ompi_mpi_register_params(void)
         ompi_rte_abort(1, NULL);
     }
Member Author

@hjelmn I'm not sure why you say that this extra assignment is necessary (in jsquyres@cd5bad8#commitcomment-15887486). Even if we come through registration a 2nd time, we don't want to reset the default if the user already set a different value. What am I missing here?

Regardless, we should either be doing this extra assignment for all MCA params, or no MCA params -- if there's a reason to do the extra assignment for this MCA param, that same reason should apply to all MCA params, right?

Member

I think we should set the variable back to the default. That is the way I have done it pretty much everywhere throughout the code base. See the other registrations in this same function.

If we come through registration twice it means we finalized the project or component. I don't think we should use a possibly stale value.

Member

I should note this is the default behavior of dynamically loaded components. If the component is unloaded all its variables will automatically be reset on the next dlopen. I was trying to be consistent.

Member Author

@hjelmn and I chatted on the phone -- he filled me in on what I was missing: there's a possible inconsistency in behavior with a program that does something like this:

MPI_T_init_thread(...);
MPI_T_cvar_read(..., &value_pre);
MPI_T_cvar_write(...);
MPI_T_finalize();

MPI_Init(...);
MPI_T_init_thread(...);
MPI_T_cvar_read(..., &value_post);
assert(value_pre == value_post);

When the cvar is part of a dynamically-loaded component, the assertion will hold, because the component's variables are reset when it is reloaded. If we remove the additional assignment, then when the cvar is part of a statically-loaded component, the assertion will fail, because the value written before finalize is still in the variable. Nathan's second assignment ensures that the assertion holds in both cases.

This is actually a larger question for the MPI Tools Working Group in the MPI Forum: what consistency guarantees -- if any -- are provided by the MPI_T API when MPI_T and/or MPI is finalized?

For this PR, I'll remove my deletion of the 2nd assignment, and we'll leave OMPI's behavior in this area as it was. Nathan and I will bring up this what-does-MPI_T-guarantee issue with the Tools WG separately.


-    ompi_add_procs_cutoff = 1024;
+    ompi_add_procs_cutoff = OMPI_ADD_PROCS_CUTOFF_DEFAULT;
     (void) mca_base_var_register ("ompi", "mpi", NULL, "add_procs_cutoff",
                                   "Maximum world size for pre-allocating resources for all "
                                   "remote processes. Increasing this limit may improve "
-                                  "communication performance at the cost of memory usage "
-                                  "(default: 1024)", MCA_BASE_VAR_TYPE_UNSIGNED_INT, NULL,
+                                  "communication performance at the cost of memory usage",
+                                  MCA_BASE_VAR_TYPE_UNSIGNED_INT, NULL,
                                   0, 0, OPAL_INFO_LVL_3, MCA_BASE_VAR_SCOPE_LOCAL,
                                   &ompi_add_procs_cutoff);
