Skip to content

Dependency Overkill in setup.py #17519

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jbrockmendel opened this issue Sep 13, 2017 · 5 comments
Closed

Dependency Overkill in setup.py #17519

jbrockmendel opened this issue Sep 13, 2017 · 5 comments

Comments

@jbrockmendel
Copy link
Member

The extensions dependencies defined in ext_data are passed to the distutils.extension.Extension constructor as

extensions = []

for name, data in ext_data.items():
    sources = [srcpath(data['pyxfile'], suffix=suffix, subdir='')]
    pxds = [pxd(x) for x in data.get('pxdfiles', [])]
    if suffix == '.pyx' and pxds:
        sources.extend(pxds)

    sources.extend(data.get('sources', []))

    include = data.get('include', common_include)

    obj = Extension('pandas.%s' % name,
                    sources=sources,
                    depends=data.get('depends', []),
                    include_dirs=include,
                    extra_compile_args=extra_compile_args)

    extensions.append(obj)

None of the data dicts has a include key, so they all end up passing include=common_include=['pandas/_libs/src/klib', 'pandas/_libs/src']. As a result(?) a bunch of the dependencies that are listed can be removed without breaking the build. e.g. I can completely get rid of lib_depends and every appearance of `_libs/src/util' without any apparent consequences.

I'm just speculating because I don't really get how depends vs include_dirs works, but it seems like having common_include added to everything is massive overkill. For most of these exts they only need a couple of the files from _libs/src/. But poking at it, I have not figured out what combination of args is needed to specify e.g. reshape.pyx cimports src/util.pxd which needs src/numpy_helper.h and src/headers/stdint.h without breaking the build.

So the questions:

  1. Is my intuition correct that there is more stuff being passed to the constructor than is really necessary?
  2. Is there a cost to passing unnecessary junk? e.g. slower build time?
  3. Is there a way to specify exactly the needed files instead of throwing the kitchen sink at it?
@jreback
Copy link
Contributor

jreback commented Sep 13, 2017

this would just add complexity
basically this is a search path
des are already specified for c imports in .pyx
so most of these are just common things

would rather we fix the inline warnings!

@jbrockmendel
Copy link
Member Author

would rather we fix the inline warnings!

I will gladly make this PR.

Totally happy to not-add complexity. Really asking for my own edification (and hopefully understanding this will make me less likely to screw up future PRs). Specific questions about two examples and then I'll drop this.

Super-simple example is _libs.tslibs.timezones which doesn't import any other cython, so adding 'include': [] to this entry (so it doesn't get set to the default common_include) doesn't break anything. The question: does trimming down the include_dirs arg lead to any savings (presumably at compile-time)?

For the other direction let's look at _libs.tslibs.frequencies. This module has cimport dependencies on cython, numpy and util. util relies on numpy_helper.h and headers/stdint.h. If we were so inclined, how would we specify exactly the needed dependencies and not the kitchen sink?

@jreback
Copy link
Contributor

jreback commented Sep 14, 2017

you can try and see if it reduces build time but i suspect it won't very much at all

@jbrockmendel
Copy link
Member Author

Again, not looking to change anything, just trying to understand how it works.

@jreback
Copy link
Contributor

jreback commented Sep 14, 2017

i think you would have to dive into how disutils assembles stuff / prob can just try some things and see

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants