Remove scipy from required dependencies #1471

Closed
@laurenyu


One recurring issue is that the Python SDK is too big for AWS Lambda (#1200). The Lambda limit is 250 MB, and the Python SDK currently takes up 278 MB, as measured by running `pip install sagemaker --target .` followed by `du -sh`. The two biggest dependencies at this point are numpy and scipy: according to `du -s`, numpy takes 86 MB and scipy takes 126 MB.
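For anyone who wants to reproduce the per-package breakdown without `du`, the following is a minimal sketch. The helper names `dir_size_mb` and `largest_packages` are hypothetical and not part of the SDK. It sums file sizes rather than disk blocks, so the numbers may differ slightly from what `du` reports.

```python
import os


def dir_size_mb(path):
    """Sum the sizes of all regular files under `path`, in MB (10**6 bytes)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total / 1e6


def largest_packages(target, top=5):
    """Return the `top` largest immediate subdirectories of `target` as (name, MB) pairs."""
    sizes = {}
    for entry in os.scandir(target):
        if entry.is_dir(follow_symlinks=False):
            sizes[entry.name] = dir_size_mb(entry.path)
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Calling `largest_packages(".")` in the directory used as the `--target` of the pip install should list the heaviest dependencies first.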

scipy is used for only one function, and that function isn't called anywhere else in the Python SDK, so users who rely on it likely already have scipy installed. We can follow the approach taken for the Local Mode and TensorFlow dependencies (#1130) and use a DeferredError for the scipy import. Removing scipy would bring the Python SDK down to 152 MB.
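To illustrate the deferred-import idea, here is a minimal sketch. It is an assumption about how such a wrapper could look, not the SDK's actual `DeferredError` implementation. The `import_or_defer` helper and the `to_sparse` function are hypothetical names introduced only for this example.

```python
import importlib


class DeferredError:
    """Placeholder for a module that failed to import.

    Any attribute access re-raises the original import error, so the
    failure only surfaces when the optional dependency is actually used.
    """

    def __init__(self, exc):
        self.exc = exc

    def __getattr__(self, name):
        # Only called for attributes not found normally, i.e. everything except `exc`.
        raise self.exc


def import_or_defer(module_name):
    """Return the imported module, or a DeferredError if it is not installed (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        return DeferredError(exc)


# Importing the package no longer fails when scipy is missing.
scipy_sparse = import_or_defer("scipy.sparse")


def to_sparse(rows):
    """Hypothetical function standing in for the single scipy-dependent function."""
    # Raises the stored ImportError here, at call time, if scipy is absent.
    return scipy_sparse.csr_matrix(rows)
```

With this pattern, users who never call the scipy-dependent function are unaffected, while those who do see the original `ImportError` pointing at the missing package.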

Removing numpy in addition to scipy would bring the Python SDK down to 66 MB. Unfortunately, numpy is needed more widely: by the first-party (1P) algorithms, Scikit-learn, PyTorch, and Chainer. Given numpy's ubiquity, and the fact that removing it isn't strictly needed to get under the Lambda limit, let's keep numpy as a required dependency for now.
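The size figures quoted in this issue can be checked with a few lines of arithmetic. The variable names are only for illustration; the numbers are the ones reported above.

```python
LAMBDA_LIMIT_MB = 250  # Lambda limit cited in this issue

sdk_total_mb = 278  # full install of the Python SDK
numpy_mb = 86
scipy_mb = 126

without_scipy_mb = sdk_total_mb - scipy_mb   # 152
without_both_mb = without_scipy_mb - numpy_mb  # 66

# Dropping scipy alone is already enough to fit under the limit.
fits_without_scipy = without_scipy_mb < LAMBDA_LIMIT_MB
fits_as_is = sdk_total_mb < LAMBDA_LIMIT_MB
```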
