nbdev_prepare throws BrokenProcessPool error on MacOS #731

Closed
WNoxchi opened this issue Aug 3, 2022 · 20 comments
Labels
bug Something isn't working

Comments

@WNoxchi

WNoxchi commented Aug 3, 2022

On nbdev 2.1.1, fastcore 1.5.14.

On (Arm) macOS, running nbdev_prepare results in the following error:

objc[12084]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[12084]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Traceback (most recent call last):
  File "/Users/username/mambaforge/bin/nbdev_prepare", line 8, in <module>
    sys.exit(prepare())
  File "/Users/username/mambaforge/lib/python3.9/site-packages/nbdev/shortcuts.py", line 101, in prepare
    nbdev_test.__wrapped__()
  File "/Users/username/mambaforge/lib/python3.9/site-packages/nbdev/test.py", line 87, in nbdev_test
    results = parallel(test_nb, files, skip_flags=skip_flags, force_flags=force_flags, n_workers=n_workers, pause=pause, do_print=do_print)
  File "/Users/username/mambaforge/lib/python3.9/site-packages/fastcore/parallel.py", line 117, in parallel
    return L(r)
  File "/Users/username/mambaforge/lib/python3.9/site-packages/fastcore/foundation.py", line 98, in __call__
    return super().__call__(x, *args, **kwargs)
  File "/Users/username/mambaforge/lib/python3.9/site-packages/fastcore/foundation.py", line 106, in __init__
    items = listify(items, *rest, use_list=use_list, match=match)
  File "/Users/username/mambaforge/lib/python3.9/site-packages/fastcore/basics.py", line 66, in listify
    elif is_iter(o): res = list(o)
  File "/Users/username/mambaforge/lib/python3.9/concurrent/futures/process.py", line 559, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/Users/username/mambaforge/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
    yield fs.pop().result()
  File "/Users/username/mambaforge/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/Users/username/mambaforge/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

This first occurred using (I believe) nbdev 2.0.8 on a public repo ready for a commit. I replicated the error with an empty test public repo after upgrading to nbdev 2.1.1.

Steps to reproduce:

  • clone a new git repo
  • run nbdev_new
  • run nbdev_install_hooks
  • run nbdev_export
  • run nbdev_install
  • run nbdev_prepare

I was not able to replicate this issue on (arm64) Ubuntu 20.04 Linux: nbdev_prepare runs successfully there. Note: in this Linux test, nbdev_new did not infer settings.ini, but it did successfully take manual input.

I remember BrokenProcessPool errors in fastai years ago on macOS; they were solved by setting the number of workers to 1 (or some other workaround).


Update: Looks like the issue is in nbdev_test (nbdev_export and nbdev_clean work without issue). I don't have any tests defined, but there was no issue on Linux.

@jph00
Contributor

jph00 commented Aug 3, 2022

Can you please provide a link to a github repo where you get this error?

@WNoxchi
Author

WNoxchi commented Aug 3, 2022

Here is the empty repo I'm using for reproducing errors: https://github.com/WNoxchi/testrepo

I'm migrating a project to nbdev2 and decided to start fresh. https://github.com/WNoxchi/alphazero locally contains code ported over from https://github.com/WNoxchi/alphazero_prev (though now that it seems like export + clean isn't an issue, I'm going to update it tomorrow -- 10-12 hours from now).

@jph00
Contributor

jph00 commented Aug 3, 2022 via email

@WNoxchi
Author

WNoxchi commented Aug 3, 2022

The example repo is up at: https://github.com/WNoxchi/alphazero_temp

I discovered a second issue while putting it together: when calling nbdev_export after copying one of my notebooks -- looks like having a dictionary that stores class objects causes an AttributeError. I commented out the code cell and exported so it's visible at: https://github.com/WNoxchi/alphazero_temp/blob/main/search.ipynb.

Export and clean work; nbdev_test (and nbdev_prepare) throw the same error as before. I wonder if the original issue is as simple as me not having any tests defined yet, or missing something from settings.ini. I tried putting in some sample tests: running nbdev_test(..) in a notebook returned a success, but no luck on the CLI. I did not export these.

The export error on a dictionary containing a class object:

Traceback (most recent call last):
  File "/Users/hakan/mambaforge/bin/nbdev_export", line 8, in <module>
    sys.exit(nbdev_export())
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/fastcore/script.py", line 116, in _f
    return tfunc(**merge(args, args_from_prog(func, xtra)))
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/nbdev/doclinks.py", line 147, in nbdev_export
    for f in files: nb_export(f)
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/nbdev/export.py", line 62, in nb_export
    create_modules(nbname, lib_path, procs=[black_format])
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/nbdev/export.py", line 57, in create_modules
    mm.make(cells, all_cells, lib_path=dest)
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/nbdev/maker.py", line 193, in make
    _all = self.make_all(all_cells)
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/nbdev/maker.py", line 109, in make_all
    return retr_exports(cells.map(NbCell.parsed_).concat())
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/nbdev/maker.py", line 95, in retr_exports
    all_assigns = assigns.filter(lambda o: getattr(o.targets[0],'id',None)=='_all_')
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/fastcore/foundation.py", line 160, in filter
    return self._new(filter_ex(self, f=f, negate=negate, gen=gen, **kwargs))
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/fastcore/basics.py", line 636, in filter_ex
    return list(res)
  File "/Users/hakan/mambaforge/lib/python3.9/site-packages/nbdev/maker.py", line 95, in <lambda>
    all_assigns = assigns.filter(lambda o: getattr(o.targets[0],'id',None)=='_all_')
AttributeError: 'AugAssign' object has no attribute 'targets'
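
The traceback above suggests the failure comes from an augmented assignment (such as `x += 1`) being inspected as if it were a plain assignment: in Python's ast module, ast.Assign nodes carry a `targets` list, while ast.AugAssign nodes carry a single `target`. A minimal sketch of that difference follows. The sample source string and the helper is_all_assignment are hypothetical illustrations, not nbdev's code.

```python
import ast

# Hypothetical notebook cell contents: one `_all_` assignment,
# one plain assignment, and one augmented assignment.
source = "_all_ = ['foo']\ncounter = 0\ncounter += 1\n"
tree = ast.parse(source)

# Collect both kinds of assignment statement from the module body.
assign_like = [n for n in tree.body if isinstance(n, (ast.Assign, ast.AugAssign))]

def is_all_assignment(node):
    """Hypothetical helper: True only for a plain `_all_ = ...` assignment."""
    # ast.AugAssign has `.target` (singular), so check the node type first
    # instead of reaching for `.targets` unconditionally.
    if not isinstance(node, ast.Assign):
        return False
    return getattr(node.targets[0], "id", None) == "_all_"

all_assigns = [n for n in assign_like if is_all_assignment(n)]
print(len(assign_like), len(all_assigns))
```

Checking the node type before reading `.targets` is one way to avoid the AttributeError; whether nbdev should handle it this way is a separate question.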

@jph00
Contributor

jph00 commented Aug 4, 2022

Sorry @WNoxchi I might not be explaining very well what we need. In your issue you said "I replicated the error with an empty test public repo after upgrading to nbdev 2.1.1."

That's the repo we need a link to. The empty test public repo which you're seeing the error in. The link you provided is to a repo where you've got quite a bit of code and stuff, and we're not going to be able to wrap our heads around all that!

"I discovered a second issue while putting it together: when calling nbdev_export after copying one of my notebooks -- looks like having a dictionary that stores class objects causes an AttributeError."

We'd love to help you fix that, but GitHub Issues isn't really the place. This is for filing reproducible bug reports, and we need one issue per bug report. For help with your code, would you mind discussing it either on the forums or on Discord first? If it turns out it's due to a bug in nbdev, we can then create an issue here to track fixing the bug.

Sorry -- hope this isn't too inconvenient!

@WNoxchi
Author

WNoxchi commented Aug 4, 2022

No problem, I'll put a post up on the forum later today.

@MichaelJFishmanBA

MichaelJFishmanBA commented Aug 5, 2022

This looks the same as 673, which seemed fixed for a time.

This error is recurring for me now, even without upgrading from 2.0.4. After upgrading to 2.1.1, it is still occurring.

Example repo here. I did nothing but nbdev_new in this repo.

(ml9) michael.fishman@michael nbdev_prepare_broken_example % nbdev_prepare
objc[5744]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[5744]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Traceback (most recent call last):
  File "/Users/michael.fishman/miniconda3/envs/ml9/bin/nbdev_prepare", line 33, in <module>
    sys.exit(load_entry_point('nbdev', 'console_scripts', 'nbdev_prepare')())
  File "/Users/michael.fishman/repos/nbdev/nbdev/shortcuts.py", line 101, in prepare
    nbdev_test.__wrapped__()
  File "/Users/michael.fishman/repos/nbdev/nbdev/test.py", line 87, in nbdev_test
    results = parallel(test_nb, files, skip_flags=skip_flags, force_flags=force_flags, n_workers=n_workers, pause=pause, do_print=do_print)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/parallel.py", line 117, in parallel
    return L(r)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/foundation.py", line 98, in __call__
    return super().__call__(x, *args, **kwargs)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/foundation.py", line 106, in __init__
    items = listify(items, *rest, use_list=use_list, match=match)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/basics.py", line 66, in listify
    elif is_iter(o): res = list(o)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/concurrent/futures/process.py", line 559, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
    yield fs.pop().result()
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

Environment:

Python: 3.8.13
nbdev: 2.1.1
fastcore: 1.5.14
OS: macOS Monterey 12.4
Processor: Apple M1

Edit: Updated my environment info
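
For anyone else reporting on this issue, an environment summary like the one above can be gathered from the interpreter that actually runs the tools. The following is a minimal sketch using only the standard library; the function names installed_version and environment_report are hypothetical, and the package list simply mirrors the packages mentioned in this thread.

```python
import platform
from importlib import metadata

def installed_version(package):
    """Return the installed version of a package, or a placeholder if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"

def environment_report(packages=("nbdev", "fastcore", "execnb")):
    """Build a short plain-text report of Python, package, and platform details."""
    lines = [f"Python: {platform.python_version()}"]
    lines += [f"{name}: {installed_version(name)}" for name in packages]
    lines.append(f"OS: {platform.system()} {platform.release()}")
    lines.append(f"Processor: {platform.machine()}")
    return "\n".join(lines)

print(environment_report())
```

Running it inside the same conda or virtualenv environment as nbdev_prepare keeps the reported versions in step with the paths shown in the traceback.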

@jph00
Contributor

jph00 commented Aug 5, 2022

Thanks @MichaelJFishmanBA I'll take a look now.

@jph00
Contributor

jph00 commented Aug 6, 2022

@MichaelJFishmanBA I tried cloning your repo and ran nbdev_prepare and it worked! Not sure what's different in your env, since it looks nearly identical to mine.

Can you confirm running nbdev_export works OK in that repo for you? Then can you try nbdev_test --n_workers 0? That will disable parallel processing, which might give us a better stack trace. Please paste any error you get in a reply so I can see what's going on. (@WNoxchi same request for you if you're still able to repro this issue please!)

@MichaelJFishmanBA

MichaelJFishmanBA commented Aug 6, 2022

@jph00

nbdev_export ran without error. nbdev_test --n_workers 0 also ran without error.

Edit: nbdev_test (without specifying --n_workers 0) still gives me the error.

@jph00
Contributor

jph00 commented Aug 6, 2022

Ouch OK that's going to be hard to debug!

How about if you first run export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES in the shell, then run nbdev_test -- does that resolve the problem?

@jph00
Contributor

jph00 commented Aug 6, 2022

Also, can you run python -c 'import fastcore; print(fastcore.__version__)' to confirm what fastcore version you're using?

@MichaelJFishmanBA

export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
nbdev_test

runs without error.
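
As an alternative to exporting the variable for the whole shell session, the same workaround can be scoped to a single child process. The sketch below is only an illustration under stated assumptions: the helper names fork_safety_env and run_with_workaround are hypothetical, and it launches a trivial Python child process as a stand-in for nbdev_test so that it runs on any platform.

```python
import os
import subprocess
import sys

VAR = "OBJC_DISABLE_INITIALIZE_FORK_SAFETY"

def fork_safety_env(base=None):
    """Return a copy of the environment with the workaround variable set."""
    env = dict(os.environ if base is None else base)
    env[VAR] = "YES"
    return env

def run_with_workaround(cmd):
    """Run cmd with the variable set for that child process only."""
    return subprocess.run(cmd, env=fork_safety_env(), capture_output=True, text=True)

# Stand-in child process that echoes the variable it received.
child = [sys.executable, "-c", f"import os; print(os.environ['{VAR}'])"]
result = run_with_workaround(child)
print(result.stdout.strip())
```

On an affected Mac, passing ["nbdev_test"] instead of the stand-in command would apply the workaround to that one run without changing the parent shell's environment.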

python -c 'import fastcore; print(fastcore.__version__)' gives 1.5.14

@jph00
Contributor

jph00 commented Aug 6, 2022

@MichaelJFishmanBA I've updated execnb in master to avoid importing fastcore.xtras and fastcore.foundation. I wonder if that will fix it. Could you try installing execnb and nbdev from master and see if that resolves it? (Be sure to unset OBJC_DISABLE_INITIALIZE_FORK_SAFETY first.)

@MichaelJFishmanBA

@jph00

I

  1. unset OBJC_DISABLE_INITIALIZE_FORK_SAFETY
  2. uninstalled nbdev and execnb
  3. pulled and installed the master branch of each
  4. ran nbdev_test

I still get this error:

(ml9) michael.fishman@michael nbdev_prepare_broken_example % nbdev_test          
objc[19935]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[19935]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Traceback (most recent call last):
  File "/Users/michael.fishman/miniconda3/envs/ml9/bin/nbdev_test", line 33, in <module>
    sys.exit(load_entry_point('nbdev', 'console_scripts', 'nbdev_test')())
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/script.py", line 116, in _f
    return tfunc(**merge(args, args_from_prog(func, xtra)))
  File "/Users/michael.fishman/repos/nbdev/nbdev/test.py", line 87, in nbdev_test
    results = parallel(test_nb, files, skip_flags=skip_flags, force_flags=force_flags, n_workers=n_workers, pause=pause, do_print=do_print)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/parallel.py", line 117, in parallel
    return L(r)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/foundation.py", line 98, in __call__
    return super().__call__(x, *args, **kwargs)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/foundation.py", line 106, in __init__
    items = listify(items, *rest, use_list=use_list, match=match)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/site-packages/fastcore/basics.py", line 66, in listify
    elif is_iter(o): res = list(o)
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/concurrent/futures/process.py", line 559, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
    yield fs.pop().result()
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/Users/michael.fishman/miniconda3/envs/ml9/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

@jph00
Contributor

jph00 commented Aug 6, 2022

OK I'll leave this open until we have a more resilient approach to running NBs, but in the meantime, please just set OBJC_DISABLE_INITIALIZE_FORK_SAFETY in your .zshrc. (It's working on all our machines and on Mac in CI so I'm a bit lost as to what's different!)

@jph00 jph00 closed this as completed in fab3c33 Aug 6, 2022
@jph00
Contributor

jph00 commented Aug 6, 2022

Pretty sure the commit I just pushed would have to fix this -- lemme know if it doesn't. It's not forking the parent process at all any more.
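
For readers unfamiliar with the distinction: the crash message in this thread is triggered by fork(), and Python's standard library can instead start workers with the "spawn" method, which launches fresh interpreters rather than forking the parent. The sketch below illustrates that general technique together with a serial fallback in the spirit of --n_workers 0. It is not the code from commit fab3c33, and run_all and square are hypothetical names.

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

def square(x):
    """Stand-in for per-notebook work; must be importable for spawned workers."""
    return x * x

def run_all(func, items, n_workers=0, start_method="spawn"):
    """Apply func to items, serially when n_workers < 1, else in a process pool."""
    items = list(items)
    if n_workers < 1:
        # Serial path: no child processes at all, so nothing is forked.
        return [func(o) for o in items]
    # "spawn" starts fresh interpreters instead of fork()ing the parent.
    ctx = mp.get_context(start_method)
    with ProcessPoolExecutor(max_workers=n_workers, mp_context=ctx) as pool:
        return list(pool.map(func, items))

print(run_all(square, range(4)))  # serial path, prints [0, 1, 4, 9]
```

Calling run_all(square, range(4), n_workers=2) from a script guarded by if __name__ == "__main__": would run the same work in spawned workers; the guard matters because spawned children re-import the main module.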

@jph00 jph00 added the bug Something isn't working label Aug 6, 2022
@MichaelJFishmanBA

@jph00

I still get the error after upgrading to

nbdev==2.1.3
fastcore==1.5.16
execnb==0.1.1

I'll use the OBJC_DISABLE_INITIALIZE_FORK_SAFETY workaround for now.

@jph00
Contributor

jph00 commented Aug 8, 2022

FYI, someone else with this problem told us that upgrading Python solved it for them.

@gsganden

I am getting the same error in a similar environment, initially with Python 3.9.11 and still after upgrading to Python 3.10.3. I am installing my library with pip install '.[dev]' inside a pyenv-virtualenv environment.

execnb: 0.1.1
fastcore: 1.5.16
nbdev: 2.1.3
OS: macOS Monterey 12.1
Processor: Apple M1 Pro
Repo branch (under heavy construction): https://github.com/gsganden/model_inspector/tree/rewrite

The export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES workaround is working, although nbdev_test is still giving me some warnings:

(model_inspector) ➜  model_inspector git:(rewrite) ✗ nbdev_test          
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:318: UserWarning: resource_tracker: There appear to be 2 leaked folder objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/3q/lv46h47d0hn5fpfhdfr83vm40000gp/T/joblib_memmapping_folder_37203_e3a629b629a5400295fefdc864c96b67_ba483fba37f1472a927a0a394f1e3d26: FileNotFoundError(2, 'No such file or directory')
  warnings.warn('resource_tracker: %s: %r' % (name, e))
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/3q/lv46h47d0hn5fpfhdfr83vm40000gp/T/joblib_memmapping_folder_37203_e3a629b629a5400295fefdc864c96b67_93e2d48172ce4d8dbdfd2bb716bc0f4d: FileNotFoundError(2, 'No such file or directory')
  warnings.warn('resource_tracker: %s: %r' % (name, e))
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:318: UserWarning: resource_tracker: There appear to be 6 leaked folder objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/3q/lv46h47d0hn5fpfhdfr83vm40000gp/T/joblib_memmapping_folder_37204_299c733ddcee4c4ab3e9726d12ae1354_3b3b4dd165e84d13b3624e72ea6d9e80: FileNotFoundError(2, 'No such file or directory')
  warnings.warn('resource_tracker: %s: %r' % (name, e))
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/3q/lv46h47d0hn5fpfhdfr83vm40000gp/T/joblib_memmapping_folder_37204_299c733ddcee4c4ab3e9726d12ae1354_683626eced384d748637878fdd1b9816: FileNotFoundError(2, 'No such file or directory')
  warnings.warn('resource_tracker: %s: %r' % (name, e))
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/3q/lv46h47d0hn5fpfhdfr83vm40000gp/T/joblib_memmapping_folder_37204_1e530423617a4e9c8cd641b0199f92b0_3295daa093bb47b98b40919e8672bfd5: FileNotFoundError(2, 'No such file or directory')
  warnings.warn('resource_tracker: %s: %r' % (name, e))
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/3q/lv46h47d0hn5fpfhdfr83vm40000gp/T/joblib_memmapping_folder_37204_299c733ddcee4c4ab3e9726d12ae1354_f7d26f3282c14860b3e979e38cb996c3: FileNotFoundError(2, 'No such file or directory')
  warnings.warn('resource_tracker: %s: %r' % (name, e))
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/3q/lv46h47d0hn5fpfhdfr83vm40000gp/T/joblib_memmapping_folder_37204_2e48b385d620474b842c0f4ab25cd5b7_ae263ab756184f0793eb24082664a646: FileNotFoundError(2, 'No such file or directory')
  warnings.warn('resource_tracker: %s: %r' % (name, e))
/Users/greg/.pyenv/versions/3.10.3/envs/model_inspector/lib/python3.10/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/3q/lv46h47d0hn5fpfhdfr83vm40000gp/T/joblib_memmapping_folder_37204_299c733ddcee4c4ab3e9726d12ae1354_4745d863762d470a8f548a8e87e24bf8: FileNotFoundError(2, 'No such file or directory')
  warnings.warn('resource_tracker: %s: %r' % (name, e))
Success.

4 participants