llama-cpp-python v0.2.0 #499


Merged · 45 commits · Sep 12, 2023
Commits
6cb77a2
Migrate to scikit-build-core. Closes #489
abetlen Jul 18, 2023
5eab1db
Merge branch 'main' into v0.2-wip
abetlen Jul 18, 2023
19ba9d3
Use numpy arrays for logits_processors and stopping_criteria. Closes …
abetlen Jul 18, 2023
792b981
Fix numpy dependency
abetlen Jul 18, 2023
7ce6cdf
Update supported python versions.
abetlen Jul 18, 2023
d2c5afe
Remove prerelease python version
abetlen Jul 18, 2023
57db1f9
Update development docs for scikit-build-core. Closes #490
abetlen Jul 19, 2023
b43917c
Add functions parameters
abetlen Jul 19, 2023
0b121a7
Format
abetlen Jul 19, 2023
0538ba1
Merge branch 'main' into v0.2-wip
abetlen Jul 20, 2023
436036a
Merge branch 'main' into v0.2-wip
abetlen Jul 21, 2023
77c9f49
Merge branch 'main' into v0.2-wip
abetlen Jul 24, 2023
3434803
Merge branch 'main' into v0.2-wip
abetlen Jul 24, 2023
1b6997d
Convert constants to python types and allow python types in low-level…
abetlen Jul 24, 2023
bf90177
Add llama_sample_grammar
abetlen Jul 24, 2023
078902a
Add llama_grammar_accept_token
abetlen Jul 24, 2023
cf405f6
Merge branch 'main' into v0.2-wip
abetlen Aug 24, 2023
ac47d55
Merge branch 'main' into v0.2-wip
abetlen Aug 25, 2023
1910793
Merge branch 'main' into v0.2-wip
abetlen Sep 12, 2023
5458427
Disable metal for ci test builds
abetlen Sep 12, 2023
b053cf7
Fix typo
abetlen Sep 12, 2023
082c2a2
disable all acceleration on macos ci builds
abetlen Sep 12, 2023
685a929
typo
abetlen Sep 12, 2023
f93fb30
Set native arch flags for macos
abetlen Sep 12, 2023
010a501
Add tune
abetlen Sep 12, 2023
fa2f1fd
Enable accelerations and set python architecture
abetlen Sep 12, 2023
9547a35
Try arm64 python
abetlen Sep 12, 2023
04a6bbe
Revert test changes
abetlen Sep 12, 2023
cf9e613
Update scikit-build-core options
abetlen Sep 12, 2023
d24383e
Disable acceleration on macos
abetlen Sep 12, 2023
4c0787b
Disable acceleration in macos tests only
abetlen Sep 12, 2023
dadfd96
Use compiler to determine best optimizations for platform
abetlen Sep 12, 2023
d123129
fix
abetlen Sep 12, 2023
2c3df16
Reorder
abetlen Sep 12, 2023
4cb0e35
string options
abetlen Sep 12, 2023
e65a823
Update flags
abetlen Sep 12, 2023
e3387e4
Add explanatory comment
abetlen Sep 12, 2023
6bddf62
Add python 3.12 to tests
abetlen Sep 12, 2023
fe743b4
Revert python 3.12 tests
abetlen Sep 12, 2023
e00d182
Bump scikit-build-core
abetlen Sep 12, 2023
bb4e67e
Using dynamic version
abetlen Sep 12, 2023
6e89775
Bump version
abetlen Sep 12, 2023
1dd3f47
Remove references to FORCE_CMAKE
abetlen Sep 12, 2023
89ae347
Remove references to force_cmake
abetlen Sep 12, 2023
bcef9ab
Update title
abetlen Sep 12, 2023
8 changes: 5 additions & 3 deletions .github/workflows/build-and-release.yaml
@@ -26,7 +26,8 @@ jobs:

- name: Install dependencies
run: |
python -m pip install --upgrade pip pytest cmake scikit-build setuptools
python -m pip install --upgrade pip
python -m pip install -e .[all]

- name: Build wheels
run: python -m cibuildwheel --output-dir wheelhouse
@@ -46,10 +47,11 @@ jobs:
- uses: actions/setup-python@v3
- name: Install dependencies
run: |
python -m pip install --upgrade pip pytest cmake scikit-build setuptools
python -m pip install --upgrade pip build
python -m pip install -e .[all]
- name: Build source distribution
run: |
python setup.py sdist
python -m build --sdist
- uses: actions/upload-artifact@v3
with:
path: ./dist/*.tar.gz
5 changes: 3 additions & 2 deletions .github/workflows/publish-to-test.yaml
@@ -19,10 +19,11 @@ jobs:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip pytest cmake scikit-build setuptools
python3 -m pip install --upgrade pip build
python3 -m pip install -e .[all]
- name: Build source distribution
run: |
python setup.py sdist
python3 -m build --sdist
- name: Publish to Test PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
5 changes: 3 additions & 2 deletions .github/workflows/publish.yaml
@@ -19,10 +19,11 @@ jobs:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip pytest cmake scikit-build setuptools
python3 -m pip install --upgrade pip build
python3 -m pip install -e .[all]
- name: Build source distribution
run: |
python setup.py sdist
python3 -m build --sdist
- name: Publish distribution to PyPI
# TODO: move to tag based releases
# if: startsWith(github.ref, 'refs/tags')
6 changes: 3 additions & 3 deletions .github/workflows/test-pypi.yaml
@@ -18,7 +18,7 @@ jobs:
- name: Install dependencies
run: |
python3 -m pip install --upgrade pip
python3 -m pip install --verbose llama-cpp-python[server,test]
python3 -m pip install --verbose llama-cpp-python[all]
- name: Test with pytest
run: |
python3 -c "import llama_cpp"
@@ -38,7 +38,7 @@ jobs:
- name: Install dependencies
run: |
python3 -m pip install --upgrade pip
python3 -m pip install --verbose llama-cpp-python[server,test]
python3 -m pip install --verbose llama-cpp-python[all]
- name: Test with pytest
run: |
python3 -c "import llama_cpp"
@@ -58,7 +58,7 @@ jobs:
- name: Install dependencies
run: |
python3 -m pip install --upgrade pip
python3 -m pip install --verbose llama-cpp-python[server,test]
python3 -m pip install --verbose llama-cpp-python[all]
- name: Test with pytest
run: |
python3 -c "import llama_cpp"
24 changes: 12 additions & 12 deletions .github/workflows/test.yaml
@@ -14,7 +14,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
python-version: ["3.8", "3.9", "3.10", "3.11"]

steps:
- uses: actions/checkout@v3
@@ -26,18 +26,18 @@ jobs:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi sse-starlette httpx uvicorn pydantic-settings
pip install . -v
python3 -m pip install --upgrade pip
python3 -m pip install .[all] -v
- name: Test with pytest
run: |
pytest
python3 -m pytest

build-windows:

runs-on: windows-latest
strategy:
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
python-version: ["3.8", "3.9", "3.10", "3.11"]

steps:
- uses: actions/checkout@v3
@@ -49,18 +49,18 @@ jobs:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi sse-starlette httpx uvicorn pydantic-settings
pip install . -v
python3 -m pip install --upgrade pip
python3 -m pip install .[all] -v
- name: Test with pytest
run: |
pytest
python3 -m pytest

build-macos:

runs-on: macos-latest
strategy:
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
python-version: ["3.8", "3.9", "3.10", "3.11"]

steps:
- uses: actions/checkout@v3
@@ -72,8 +72,8 @@ jobs:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi sse-starlette httpx uvicorn pydantic-settings
pip install . -v
python3 -m pip install --upgrade pip
python3 -m pip install .[all] --verbose
- name: Test with pytest
run: |
pytest
python3 -m pytest
48 changes: 26 additions & 22 deletions CMakeLists.txt
@@ -2,33 +2,37 @@ cmake_minimum_required(VERSION 3.4...3.22)

project(llama_cpp)

option(FORCE_CMAKE "Force CMake build of Python bindings" OFF)
option(LLAMA_BUILD "Build llama.cpp shared library and install alongside python package" ON)

set(FORCE_CMAKE $ENV{FORCE_CMAKE})

if (UNIX AND NOT FORCE_CMAKE)
add_custom_command(
OUTPUT ${CMAKE_CURRENT_SOURCE_DIR}/vendor/llama.cpp/libllama.so
COMMAND make libllama.so
WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/vendor/llama.cpp
)
add_custom_target(
run ALL
DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/vendor/llama.cpp/libllama.so
)
install(
FILES ${CMAKE_CURRENT_SOURCE_DIR}/vendor/llama.cpp/libllama.so
DESTINATION llama_cpp
)
else()
if (LLAMA_BUILD)
set(BUILD_SHARED_LIBS "On")
if (APPLE)
# Need to disable these llama.cpp flags on Apple
# otherwise users may encounter invalid instruction errors
set(LLAMA_AVX "Off" CACHE BOOL "llama: enable AVX" FORCE)
set(LLAMA_AVX2 "Off" CACHE BOOL "llama: enable AVX2" FORCE)
set(LLAMA_FMA "Off" CACHE BOOL "llama: enable FMA" FORCE)
set(LLAMA_F16C "Off" CACHE BOOL "llama: enable F16C" FORCE)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=native -mtune=native")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native -mtune=native")
endif()
add_subdirectory(vendor/llama.cpp)
install(
TARGETS llama
LIBRARY DESTINATION llama_cpp
RUNTIME DESTINATION llama_cpp
ARCHIVE DESTINATION llama_cpp
FRAMEWORK DESTINATION llama_cpp
RESOURCE DESTINATION llama_cpp
LIBRARY DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
RUNTIME DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
ARCHIVE DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
FRAMEWORK DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
RESOURCE DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
)
# Temporary fix for https://github.com/scikit-build/scikit-build-core/issues/374
install(
TARGETS llama
LIBRARY DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
RUNTIME DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
ARCHIVE DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
FRAMEWORK DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
RESOURCE DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
)
endif()
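The net effect of this CMakeLists change is that the package build is now always driven through CMake by scikit-build-core, with `LLAMA_BUILD` replacing the old `FORCE_CMAKE` / hand-rolled `make libllama.so` path. For context, a minimal `pyproject.toml` wiring for such a setup might look like the sketch below — the keys and values are illustrative assumptions, not the actual file from this PR:

```toml
# Illustrative sketch of a scikit-build-core pyproject.toml, not the PR's file.
[build-system]
requires = ["scikit-build-core>=0.5"]
build-backend = "scikit_build_core.build"

[project]
name = "llama_cpp_python"
dynamic = ["version"]

[tool.scikit-build]
# Ship the pure-Python package alongside the CMake-installed shared library.
wheel.packages = ["llama_cpp"]

[tool.scikit-build.metadata.version]
# Read __version__ out of the package instead of hardcoding it here.
provider = "scikit_build_core.metadata.regex"
input = "llama_cpp/__init__.py"
```

With this in place, `pip install .` invokes CMake automatically, which is why the workflows above could drop the explicit `cmake`/`scikit-build`/`setuptools` installs.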
18 changes: 11 additions & 7 deletions Makefile
@@ -5,26 +5,30 @@ update:
update.vendor:
cd vendor/llama.cpp && git pull origin master

deps:
python3 -m pip install pip
python3 -m pip install -e ".[all]"

build:
python3 setup.py develop
python3 -m pip install -e .

build.cuda:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 python3 setup.py develop
CMAKE_ARGS="-DLLAMA_CUBLAS=on" python3 -m pip install -e .

build.opencl:
CMAKE_ARGS="-DLLAMA_CLBLAST=on" FORCE_CMAKE=1 python3 setup.py develop
CMAKE_ARGS="-DLLAMA_CLBLAST=on" python3 -m pip install -e .

build.openblas:
CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 python3 setup.py develop
CMAKE_ARGS="-DLLAMA_CLBLAST=on" python3 -m pip install -e .

build.blis:
CMAKE_ARGS="-DLLAMA_OPENBLAS=on -DLLAMA_OPENBLAS_VENDOR=blis" FORCE_CMAKE=1 python3 setup.py develop
CMAKE_ARGS="-DLLAMA_OPENBLAS=on -DLLAMA_OPENBLAS_VENDOR=blis" python3 -m pip install -e .

build.metal:
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 python3 setup.py develop
CMAKE_ARGS="-DLLAMA_METAL=on" python3 -m pip install -e .

build.sdist:
python3 setup.py sdist
python3 -m build --sdist

deploy.pypi:
python3 -m twine upload dist/*
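Every Makefile target above now routes through `pip install -e .`, and backend selection happens purely via the `CMAKE_ARGS` environment variable, which the build backend splits into individual CMake flags before invoking CMake. A rough sketch of that splitting step (scikit-build-core's real implementation differs; `cmake_args_from_env` is a hypothetical helper for illustration):

```python
import os
import shlex


def cmake_args_from_env(env=os.environ):
    """Split the CMAKE_ARGS environment variable into individual CMake
    arguments, the way a shell would.  Illustrative only -- scikit-build-core
    does something similar internally, but this is not its actual code."""
    raw = env.get("CMAKE_ARGS", "")
    return shlex.split(raw)


print(cmake_args_from_env({"CMAKE_ARGS": "-DLLAMA_CUBLAS=on"}))
# ['-DLLAMA_CUBLAS=on']
```

This is why `FORCE_CMAKE=1` is no longer needed in any of the targets: the environment variable alone is enough to steer the build.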
17 changes: 9 additions & 8 deletions README.md
@@ -1,4 +1,4 @@
# 🦙 Python Bindings for `llama.cpp`
# 🦙 Python Bindings for [`llama.cpp`](https://github.com/ggerganov/llama.cpp)

[![Documentation Status](https://readthedocs.org/projects/llama-cpp-python/badge/?version=latest)](https://llama-cpp-python.readthedocs.io/en/latest/?badge=latest)
[![Tests](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml)
@@ -48,7 +48,6 @@ Otherwise, while installing it will build the llama.ccp x86 version which will b
### Installation with Hardware Acceleration

`llama.cpp` supports multiple BLAS backends for faster processing.
Use the `FORCE_CMAKE=1` environment variable to force the use of `cmake` and install the pip package for the desired BLAS backend.

To install with OpenBLAS, set the `LLAMA_BLAS and LLAMA_BLAS_VENDOR` environment variables before installing:

@@ -208,24 +207,26 @@ If you find any issues with the documentation, please open an issue or submit a

This package is under active development and I welcome any contributions.

To get started, clone the repository and install the package in development mode:
To get started, clone the repository and install the package in editable / development mode:

```bash
git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python

# Upgrade pip (required for editable mode)
pip install --upgrade pip

# Install with pip
pip install -e .

# if you want to use the fastapi / openapi server
pip install -e .[server]

# If you're a poetry user, installing will also include a virtual environment
poetry install --all-extras
. .venv/bin/activate
# to install all optional dependencies
pip install -e .[all]

# Will need to be re-run any time vendor/llama.cpp is updated
python3 setup.py develop
# to clear the local build cache
make clean
```

# How does this compare to other Python bindings of `llama.cpp`?
2 changes: 1 addition & 1 deletion docker/cuda_simple/Dockerfile
@@ -21,7 +21,7 @@ ENV LLAMA_CUBLAS=1
RUN python3 -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi uvicorn sse-starlette pydantic-settings

# Install llama-cpp-python (build with cuda)
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python

# Run the server
CMD python3 -m llama_cpp.server
4 changes: 2 additions & 2 deletions docker/simple/Dockerfile
@@ -19,9 +19,9 @@ RUN mkdir /app
WORKDIR /app
COPY . /app

RUN python3 -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi uvicorn sse-starlette pydantic-settings
RUN python3 -m pip install --upgrade pip

RUN make build && make clean
RUN make deps && make build && make clean

# Set environment variable for the host
ENV HOST=0.0.0.0
5 changes: 4 additions & 1 deletion docs/index.md
@@ -82,9 +82,12 @@ To get started, clone the repository and install the package in development mode

```bash
git clone [email protected]:abetlen/llama-cpp-python.git
cd llama-cpp-python
git submodule update --init --recursive
# Will need to be re-run any time vendor/llama.cpp is updated
python3 setup.py develop

pip install --upgrade pip
pip install -e .[all]
```

## License
2 changes: 1 addition & 1 deletion docs/install/macos.md
@@ -30,7 +30,7 @@ conda activate llama
*(you needed xcode installed in order pip to build/compile the C++ code)*
```
pip uninstall llama-cpp-python -y
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
CMAKE_ARGS="-DLLAMA_METAL=on" pip install -U llama-cpp-python --no-cache-dir
pip install 'llama-cpp-python[server]'

# you should now have llama-cpp-python v0.1.62 or higher installed
2 changes: 1 addition & 1 deletion llama_cpp/__init__.py
@@ -1,4 +1,4 @@
from .llama_cpp import *
from .llama import *

from .version import __version__
__version__ = "0.2.0"
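With this change the version string is defined statically in `llama_cpp/__init__.py` and picked up dynamically at build time. A regex-based metadata provider, such as the one scikit-build-core ships, works roughly like the sketch below — `extract_version` and the exact pattern are illustrative, not the provider's real code:

```python
import re

# Hypothetical pattern matching an assignment like: __version__ = "0.2.0"
VERSION_RE = re.compile(r'__version__\s*=\s*["\']([^"\']+)["\']')


def extract_version(source: str) -> str:
    """Pull the version string out of a module's source text."""
    match = VERSION_RE.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


print(extract_version('__version__ = "0.2.0"'))
# 0.2.0
```

Keeping the version in one place means bumping it requires editing only `__init__.py`; the wheel and sdist metadata follow automatically.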