WeeklyTelcon_20171024
Geoffrey Paulsen edited this page Jan 9, 2018 · 1 revision
      
    - Dialup Info: (Do not post to public mailing list or public wiki)
 
- Geoff Paulsen (IBM)
 - Jeff Squyres
 - Brian
 - Geoffroy Vallee
 - George
 - Howard
 - Josh Hursey
 - Mathew / SNL
 - Mohan
 - Nathan Hjelm
 - Ralph
 - Todd Kordenbrock
 - Edgar Gabriel
 
Review All Open Blockers
Review v2.0.x Milestones v2.0.4
- NEWS - Labor intensive to write NEWS every time. Can't we automate this?
 - Can we just use the short titles from the PR titles?
 - Not this week.
 - Don't include the High Sierra fix.
 - Schedule: Get it out this week.
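
The idea of building NEWS from short PR titles could look roughly like the sketch below. It is only an illustration, not an agreed process: the function name, the `(number, title)` input shape, the output format, and the sample data are all assumptions, and the `exclude` argument stands in for items that are deliberately left out of a release's NEWS.

```python
def format_news_entries(pull_requests, exclude=()):
    """Turn (number, title) pairs into NEWS-style bullet lines.

    pull_requests: iterable of (int, str) pairs (hypothetical input shape).
    exclude: PR numbers to leave out of NEWS.
    """
    excluded = set(exclude)
    lines = []
    for number, title in pull_requests:
        if number in excluded:
            continue
        # Normalize whitespace and the trailing period so entries look uniform.
        cleaned = title.strip().rstrip(".")
        if cleaned:
            lines.append(f"- {cleaned}. (PR #{number})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up placeholder data, not real Open MPI pull requests.
    sample = [(1, "Example bug fix"), (2, "Example fix to skip"), (3, "Another example. ")]
    print(format_news_entries(sample, exclude=[2]))
```

Someone would still need to review the generated list, since PR titles are not always written with end users in mind.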
 
Review v2.x Milestones v2.1.2
- v2.1.3 (unscheduled, but probably Jan 19, 2018)
- PR4172 - a mix between feature / bugfix.
 
 - Are we going to do anything for v2.x for hwloc 2?
- At least add a configure error if hwloc v2.x is detected.
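
The logic of such a guard is simple. The real check would live in the configure scripts; the Python below is only a sketch of the decision, and the function name, the version-string input, and the error text are assumptions.

```python
def check_hwloc_version(version, max_supported_major=1):
    """Return the major version, or raise if hwloc is too new for this branch.

    version: a dotted version string such as "1.11.7" (assumed input format).
    """
    major = int(version.split(".")[0])
    if major > max_supported_major:
        raise RuntimeError(
            f"hwloc {version} detected, but this branch only supports "
            f"hwloc {max_supported_major}.x"
        )
    return major


if __name__ == "__main__":
    print(check_hwloc_version("1.11.7"))
    try:
        check_hwloc_version("2.0.0")
    except RuntimeError as err:
        print(f"configure error: {err}")
```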
 
 
Review v3.0.x Milestones v3.0
- v3.0.1
- Still targeting End of October for release of v3.0.1
 - a few PRs need review.
 - Schedule: Still shooting for End of October.
 
 
Review v3.1.x Milestones [v3.1](https://github.com/open-mpi/ompi/milestone/27)
- v3.1.x
- Roll hwloc back to 1.11.7 on v3.1.x branch (Ralph put together, Brian reviews)
 - Will support an external hwloc v2.0.x, but default will be hwloc 1.11.7.
 - PMIx - v3.1.0 was supposed to go out with PMIx 2.1.0 with cross version support
- Cross version support of PMIx is working fine, as long as not using PMIx shared memory.
- Fixing shared memory piece in v2.1 (with cross version support) needs a complete re-write.
 
 - Ticket out there, needs review.
 - Do we want to ship with PMIx v2.0 and no cross-version support? Or with PMIx v2.1, but without shared memory support? (That would take a number of build-time flags to get it to build.)
 
 - Could delay...
 - Could we ship BOTH, and have the default be the PMIx v2.1 without shared memory
- provide a configure time flag to build with PMIx v2.0 to allow shared memory for high core-count platforms.
 - BUT, the backwards-compatible PMIx v2.1 still doesn't work with older PMIx versions if they were built with dstore (which is/was the default), so users would have to go back and rebuild their PMIx installations.
 
 - All of our options are BAD, so let's delay a week and discuss next week what we can do.
 - Send out an email to devel-core, and say we're going to delay v3.1 to fix it.
- Amazon will scope the amount of changes for dstore this week.
 
 
 - Schedule - Unsure, will see about the above, and discuss next week.
 - Add v3.1 to MTT tests
- Database is active now to accept v3.1 tests.
 
 - MTT disks were getting full - PHP was trying to use /tmp, and local /tmp was full all weekend, so submissions weren't working. Josh moved what he could, but still thinks PHP is putting something in /tmp.
 - Administration
- Restored the Partner designation.
 - Voted in Mexico Consortium
 
 
Review Master Master Pull Requests
- Looking reasonably good, but history is all mucked up.
 - Something is going on with Jenkins (it looks like it's totally turned off right now)
 - Treematch segfault issue - just master?  We think.
- IBM has a patch we'll get PRed upstream, not sure if it fixes the same root issue others were seeing, but it fixes it in IBM's environment.
 
 - George accidentally pushed a branch 'v3.x' into upstream.
- Just delete it.
 
 - Jenkins - Botny Bay, and Berkly machines - both had issues where Jenkins couldn't ssh into those machines, and logged that it couldn't.
- This filled the disk, and ran us out of Web server credits.
 - Brian will send Nathan the config for setting up a daemon for connections, so Jenkins won't sit in a loop trying to ssh to nodes it can't reach. He already has the Mac OS X config.
 - There is a wiki page with instructions also.
 - Brian will also put Jenkins on its own partition to help isolate us.
 - When Jenkins goes bonkers it consumes all CPU cycles on the machine.
 
 - Discussed Issue 4349
- We seem to remember disabling it due to a real bug.
 - IBM will dig through notes and reply on Issue.
 
 
Review Master MTT testing
- Website - openmpi.org
- Brian trying to make things more automated, so can checkout repo, etc. Repo is TOO large.
 - The majority of the problem is the tarballs, and those are already stored in S3.
 
 
- Need to see if Attributes are MT - IBM will see if we have any tests to audit.
- Asked, need to get answer back from them.
 
 
- Jan / Feb
 - Possible locations: San Jose, Portland, Albuquerque, Dallas
 
- Mellanox, Sandia, Intel
 - LANL, Houston, IBM, Fujitsu
 - Amazon
 - Cisco, ORNL, UTK, NVIDIA