Description
One recurring issue is that the Python SDK is too big for AWS Lambda (#1200). The Lambda limit is 250 MB, and the Python SDK currently takes up 278 MB, as measured with `pip install sagemaker --target .` and `du -sh`. The two biggest dependencies, at this point, are numpy and scipy. According to `du -s`, numpy takes 86 MB and scipy takes 126 MB.
scipy is used for only one function, and that function isn't called anywhere within the Python SDK itself, so users who call it likely already have scipy installed. We can follow the approach taken for the Local Mode and TensorFlow dependencies (#1130) and use a `DeferredError` for the scipy import. Removing scipy would bring the Python SDK down to 152 MB.
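For readers unfamiliar with the deferred-import pattern, the following is a minimal sketch of the idea, not the SDK's actual code. The `DeferredError` class here is a simplified re-implementation written for illustration, and `import_or_defer` is a hypothetical helper name; the real class in the SDK may differ in details. The point is that a missing optional dependency only raises when something actually touches it, rather than at SDK import time.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


class DeferredError:
    """Placeholder for a module that failed to import.

    Illustrative sketch: any attribute access re-raises the
    original import error, so the failure surfaces only on use.
    """

    def __init__(self, exception):
        self.exc = exception

    def __getattr__(self, name):
        # Only reached for attributes not found normally (i.e. anything but `exc`).
        raise self.exc


def import_or_defer(module_name):
    """Return the imported module, or a DeferredError if it is not installed.

    Hypothetical helper, named here for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        logger.warning(
            "%s failed to import. Features that depend on it will be unavailable.",
            module_name,
        )
        return DeferredError(e)


# Importing never fails, even when scipy is absent.
scipy = import_or_defer("scipy")
```

With this in place, code that never touches `scipy` runs without it installed, while a call such as `scipy.sparse` on a machine without scipy raises the original `ImportError` at that point.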
Removing numpy in addition to scipy would bring the Python SDK down to 66 MB. Unfortunately, numpy is needed more widely: by the 1Ps, Scikit-learn, PyTorch, and Chainer. Given numpy's ubiquity, and the fact that removing it isn't strictly needed to get under the Lambda limit, let's keep numpy as a required dependency for now.