I believe such a change could reduce the uncompressed size of the package by about 150kb.
| | main | without tests |
|---|---|---|
| compressed | 56k | 48k |
| uncompressed | 576k | 424k |
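As a quick check of the "about 150kb" estimate, the arithmetic behind the figures above can be sketched as follows. The `savings` helper and the `sizes_kb` dictionary are names made up for this illustration; the numbers are the ones reported above.

```python
# Sizes in KB, taken from the comparison above (main vs. without tests).
sizes_kb = {
    "compressed": {"main": 56, "without tests": 48},
    "uncompressed": {"main": 576, "without tests": 424},
}


def savings(before_kb, after_kb):
    """Return (absolute saving in KB, relative saving in percent)."""
    saved = before_kb - after_kb
    return saved, 100 * saved / before_kb


for kind, row in sizes_kb.items():
    saved, pct = savings(row["main"], row["without tests"])
    print(f"{kind}: {saved} KB saved ({pct:.1f}%)")
```

This gives 8 KB (about 14%) for the compressed archive and 152 KB (about 26%) for the unpacked sources.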
how I estimated this
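Place the following at the root of the repo, call it check-sizes.sh, and run bash check-sizes.sh (the script uses pushd and popd, which are bash builtins that a strictly POSIX sh may not provide).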
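# Remove artifacts left over from previous builds.
rm -rf dask_kubernetes.egg-info
rm -rf dist/
rm -rf __pycache__
echo ""
echo "building source distribution"
echo ""
python setup.py sdist > /dev/null
# Keep a copy of the list of files that went into the sdist.
cp dask_kubernetes.egg-info/SOURCES.txt ~/DASK-K8S-SOURCES.txt
pushd dist/
echo ""
echo "sdist compressed size"
echo ""
du -a -h .
# Unpack the archive, then delete it so only the extracted files are measured.
tar -xf dask-kubernetes*.tar.gz
rm dask-kubernetes*.tar.gz
ls .
echo ""
echo "sdist uncompressed size"
echo ""
du -sh .
popd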
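For anyone who prefers not to unpack the archive, roughly the same measurement can be sketched in Python with the standard library. This is only an illustration, not part of the proposal: `sdist_sizes` is a hypothetical helper name, and the `dist/*.tar.gz` pattern assumes the sdist was already built with `python setup.py sdist`. The numbers will differ somewhat from `du`, which reports disk usage in blocks rather than the exact byte count of each file.

```python
import glob
import os
import tarfile


def sdist_sizes(archive_path):
    """Return (compressed_bytes, uncompressed_bytes) for a .tar.gz sdist."""
    compressed = os.path.getsize(archive_path)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Sum regular files only; directory entries carry no payload.
        uncompressed = sum(m.size for m in tar.getmembers() if m.isfile())
    return compressed, uncompressed


if __name__ == "__main__":
    # Assumes the archive(s) live in dist/ at the root of the repo.
    for path in glob.glob("dist/*.tar.gz"):
        compressed, uncompressed = sdist_sizes(path)
        print(f"{path}: {compressed / 1024:.0f}K compressed, "
              f"{uncompressed / 1024:.0f}K uncompressed")
```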
Change MANIFEST.in to the following and run the script again.
recursive-include dask_kubernetes *.py
recursive-include dask_kubernetes *.yaml
recursive-exclude dask_kubernetes/tests *
include setup.py
include setup.cfg
include LICENSE
include README.rst
include requirements.txt
include MANIFEST.in
include versioneer.py
recursive-exclude * __pycache__
recursive-exclude * *.py[co]
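One way to confirm that a manifest like the one above actually keeps tests out of the archive is to list the sdist's members and look for anything under a tests directory. The sketch below is only an illustration: `find_test_files` is a hypothetical helper, and the `/tests/` marker assumes the tests live in `dask_kubernetes/tests`, as the exclude rule above implies.

```python
import tarfile


def find_test_files(archive_path, marker="/tests/"):
    """Return the member names in a .tar.gz sdist whose path contains `marker`."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return [name for name in tar.getnames() if marker in name]
```

An empty list for the rebuilt archive would suggest the exclusion took effect; a non-empty list shows which files are still being shipped.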
why I think this change should be considered
Even though the package is already fairly small and the savings here are not huge, I still think this is worth doing unless there is a compelling reason to allow tests to be run from the published package artifact (instead of from cloning the repo).
reduces amount of data transfer for PyPI and conda channels serving this package
reduces the total size of any container images built with dask-kubernetes in them (and, therefore, data transfer when those images are pulled + storage footprint of those images)
Excluding tests from a distribution is a much-debated move. Some hold that being able to run the tests of a package you installed via pip is a good way of testing a production environment. I was discussing this with @Cadair on another project only recently.
Given that the savings here are marginal, I would be tempted not to do this. That said, I would be interested to know what @jakirkham and @jrbourbeau think.
Short Description
Would maintainers here be open to a pull request that modifies MANIFEST.in to exclude test files from this project's package artifacts?
Long Description
Today, all test files are included in the project's package artifacts.
python setup.py sdist > /dev/null
cat dask_kubernetes.egg-info/SOURCES.txt
references
Other projects that have adopted similar changes when I've proposed them:
Thanks for your time and consideration!