Description
The current handling of dependencies is quite monolithic: users must install them all regardless of the subset of features they want to use. We should make Haystack more modular at install time.
Options
Nowadays there are several ways to properly handle dependency groups:
- several requirements.txt files: quite old-fashioned by now and a bit harder to manage
- extras_require in setup.py: the "traditional" way, safe and widely used
- pyproject.toml: the new way, as recommended by PEP 517 and PEP 660
Proposed dependency groups
- minimal: basic Haystack on CPU with a single document store (inMemory maybe)
- gpu: for running Haystack on GPU
- rest: also installs the REST server API deps
- ui: installs the Streamlit deps
- demo: rest + ui
- ci: for GitHub runners
- win: for Windows installs (if possible)
- colab: to work around Colab-specific issues when necessary
- One group for each document store
- all_doc_stores: installs the dependencies of all document stores
- test: for the test dependencies
- docs: for building documentation
- code: black, linter and possible extra tools if/when we introduce them
- all (or dev): complete dependency list for development and contributing. Includes all of the above.
We can also consider adding smaller groups for special components with exotic dependencies, like crawler, ocr, etc.
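To make the grouping idea concrete, here is a rough sketch of how composite groups such as demo, all_doc_stores and all could be derived from the smaller ones instead of being maintained by hand. The package names are placeholders chosen for illustration (only Streamlit and black are mentioned above), and build_extras is a hypothetical helper, not something that exists in Haystack today.

```python
from typing import Dict, List

# Placeholder package lists, for illustration only. The real groups and
# version pins would have to come from Haystack's actual requirements.
DOC_STORES: Dict[str, List[str]] = {
    "elasticsearch": ["elasticsearch"],
    "faiss": ["faiss-cpu"],
}


def build_extras(doc_stores: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build an extras mapping in which composite groups are derived."""
    extras: Dict[str, List[str]] = {
        "rest": ["fastapi", "uvicorn"],  # assumed REST server deps
        "ui": ["streamlit"],
        "test": ["pytest"],              # assumed test deps
        "code": ["black"],
    }
    # One group per document store.
    extras.update(doc_stores)
    # all_doc_stores is the union of every document store group.
    extras["all_doc_stores"] = sorted(
        {dep for deps in doc_stores.values() for dep in deps}
    )
    # demo is simply rest + ui.
    extras["demo"] = sorted(set(extras["rest"] + extras["ui"]))
    # all (alias dev) is the union of every group defined so far.
    extras["all"] = sorted({dep for deps in extras.values() for dep in deps})
    extras["dev"] = list(extras["all"])
    return extras


EXTRAS = build_extras(DOC_STORES)
```

A mapping like this could be passed as extras_require in setup.py, so that adding a new document store group automatically updates all_doc_stores and all.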
Default install
It's up for debate what the default install (pip install haystack) should look like.
The important point is that whatever gets installed by default effectively becomes mandatory: extras can only add dependencies on top of the default set, never remove them. This is at least the case when using extras_require in setup.py, and might have changed with pyproject.toml. If it's still the case, the default install should effectively be a minimal install. For example, if we include the GPU deps in the default set, they become mandatory, and a pure CPU install becomes impossible.
I will investigate the options and update this section with new information.
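As a simplified model of why the default set ends up being mandatory, the sketch below mimics how an installer combines the base requirements with the requested extras. It is not pip's real resolver: resolve_install, CORE and the package names are hypothetical, and unknown extras are just ignored for brevity.

```python
from typing import Dict, Iterable, List

# Hypothetical dependency sets, for illustration only.
CORE: List[str] = ["numpy"]
EXTRAS: Dict[str, List[str]] = {
    "gpu": ["example-gpu-backend"],
    "rest": ["example-rest-server"],
}


def resolve_install(
    install_requires: Iterable[str],
    extras_require: Dict[str, List[str]],
    requested: Iterable[str] = (),
) -> List[str]:
    """Return the packages installed for the given requested extras."""
    # The base requirements are always included: no extra can opt out of them.
    deps = set(install_requires)
    for extra in requested:
        # Extras only ever add packages on top of the base set.
        deps.update(extras_require.get(extra, []))
    return sorted(deps)


# pip install haystack
default_install = resolve_install(CORE, EXTRAS)
# pip install haystack[gpu]
gpu_install = resolve_install(CORE, EXTRAS, ["gpu"])
```

Under this model, moving example-gpu-backend from EXTRAS into CORE would make it part of every install, which is exactly the situation to avoid for CPU-only users.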
Related issues
Related to #1291, #1716, #1826, #1806
Closes #1070
Next steps
- Learn more about what's currently possible with pyproject.toml and whether all of our dependencies can actually work with it. As of last year there were still some issues with large libraries that needed complex build steps.
- Finalize dependency groups list
- Define what a default install should look like
- Investigate how to properly handle failed imports for unmet dependencies
- Fix dependency-related issues (like #1806, "Improve Colab setup experience by simplifying dependencies")
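On the failed imports point, one possible direction is to wrap optional imports so that a missing dependency produces an error naming the group to install. This is only a sketch of the idea: import_optional is a hypothetical helper, and the group names and the haystack package name in the message are assumptions taken from the lists above.

```python
import importlib
from types import ModuleType


def import_optional(
    module_name: str, extra: str, package: str = "haystack"
) -> ModuleType:
    """Import an optional module, or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a hint about which dependency group is missing,
        # keeping the original error chained for debugging.
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed. "
            f"Try: pip install {package}[{extra}]"
        ) from exc
```

A component with exotic dependencies could then call something like import_optional("some_ocr_lib", extra="ocr") at the point where the library is first needed (some_ocr_lib being a placeholder), so that users of a minimal install only see the error when they actually use that component.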