Make numpy and pandas optional for ~7 times smaller deps #153
Changes from all commits
9e83480
acd8b93
658a4ca
41fb5d2
49941e4
69a42c6
8bd45b2
184248c
1d4a5af
cbe9446
054f9b4
129e6ba
1ffae5d
be99210
4721f67
@@ -0,0 +1,56 @@
"""
This module helps make data libraries like `numpy` and `pandas` optional dependencies.

The libraries add up to 130MB+, which makes it challenging to deploy applications
using this library in environments with code size constraints, like AWS Lambda.

This module serves as an import proxy and provides a few utilities for dealing with the optionality.

Since the primary use case of this library (talking to the OpenAI API) doesn’t generally require data libraries,
it’s safe to make them optional. The rare case when data libraries are needed in the client is handled through
assertions with instructive error messages.

See also `setup.py`.

"""
try:
    import numpy
except ImportError:
    numpy = None

try:
    import pandas
except ImportError:
    pandas = None

HAS_NUMPY = bool(numpy)
HAS_PANDAS = bool(pandas)

INSTRUCTIONS = """

OpenAI error:

    missing `{library}`

This feature requires additional dependencies:

    $ pip install openai[datalib]

"""

NUMPY_INSTRUCTIONS = INSTRUCTIONS.format(library="numpy")
PANDAS_INSTRUCTIONS = INSTRUCTIONS.format(library="pandas")


class MissingDependencyError(Exception):
    pass


def assert_has_numpy():
    if not HAS_NUMPY:
        raise MissingDependencyError(NUMPY_INSTRUCTIONS)


def assert_has_pandas():
    if not HAS_PANDAS:
        raise MissingDependencyError(PANDAS_INSTRUCTIONS)
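
The import-proxy idea in the module above can be sketched in a generic, self-contained form. This is only an illustration, not the PR's code: `optional_import`, `require`, and `dumps_sorted` are hypothetical names, the standard-library `json` module stands in for an installed optional dependency, and a made-up package name stands in for a missing one.

```python
import importlib


class MissingDependencyError(Exception):
    """Raised when an optional dependency is needed but not installed."""


def optional_import(name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require(module, name, extra):
    """Return `module`, or raise an instructive error if it is None."""
    if module is None:
        raise MissingDependencyError(
            f"missing `{name}`; try: pip install openai[{extra}]"
        )
    return module


# `json` ships with Python, so it stands in for an installed dependency.
json_mod = optional_import("json")
# This name is deliberately made up; it stands in for a missing dependency.
missing_mod = optional_import("surely_not_an_installed_package_xyz")


def dumps_sorted(obj):
    """A feature that checks for its optional dependency only when called."""
    json_lib = require(json_mod, "json", "datalib")
    return json_lib.dumps(obj, sort_keys=True)
```

The point of the design is that importing the package never fails; only the code paths that actually need the data libraries raise, and they do so with a message that names the fix.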
@@ -2,8 +2,6 @@
 from typing import List, Optional

 import matplotlib.pyplot as plt
-import numpy as np
-import pandas as pd
 import plotly.express as px
 from scipy import spatial
 from sklearn.decomposition import PCA
@@ -12,6 +10,8 @@
 from tenacity import retry, stop_after_attempt, wait_random_exponential

 import openai
+from openai.datalib import numpy as np
+from openai.datalib import pandas as pd
Comment on lines +13 to +14

I wonder if we should call […]

The […] It could be improved, though. I think each optional extra […]

I wasn't sure whether you’d be interested in the PR, but it looks like you are, so I’ll polish it a bit: I’m thinking maybe throwing an […]

It’s to a degree a backward-incompatible change (for existing users who don’t install […]

Oh, you're right, this is an embeddings file, so it will have the right dependencies.

Regarding the backward incompatibility: yes, it's unfortunate, but personally I think it's probably OK as long as the error is clear and explains how to resolve the problem. Also the line in […]

See #124 for some historical context about how deps have been handled.
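
The thread above refers to optional extras, and the error message in the new module points users at `pip install openai[datalib]`. A minimal sketch of how such an extra might be declared follows. The variable names and the unpinned package list are assumptions for illustration, not the project's actual `setup.py`.

```python
# Hypothetical excerpt in the style of a setup.py. The names below and the
# absence of version pins are assumptions, not the project's real settings.
DATALIB_DEPENDENCIES = ["numpy", "pandas"]

EXTRAS_REQUIRE = {
    # `pip install openai[datalib]` would pull in exactly this list.
    "datalib": DATALIB_DEPENDENCIES,
}

# In a real setup.py this mapping would be handed to setuptools:
#     setup(..., extras_require=EXTRAS_REQUIRE)
```

With a layout like this, a plain `pip install openai` would skip the data libraries, which is the backward-incompatible part discussed in the comments.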


 @retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
Very nice
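
Because the reviewers agree that the error path matters most for users without the extra installed, it can help to exercise that path without uninstalling anything. The sketch below is an assumption about how one might test it, not part of the PR: it relies on the documented Python behaviour that a `None` entry in `sys.modules` makes an import of that name raise `ImportError`. The helper names are hypothetical, and `json` again stands in for a data library.

```python
import importlib
import sys
from unittest import mock


def try_import(name):
    """Mirror the try/except pattern: module on success, None on ImportError."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def import_with_blocked(name):
    """Attempt the import while `name` is masked as unavailable."""
    # A None entry in sys.modules makes importing `name` raise ImportError.
    # patch.dict restores the original sys.modules contents on exit.
    with mock.patch.dict(sys.modules, {name: None}):
        return try_import(name)
```

A test for the real module could apply the same masking to `numpy` or `pandas` before importing the proxy, then check that the assertion helper raises `MissingDependencyError`.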