LLaVA C++ Server

Bart Trzynadlowski, 2023

A simple API server for the llama.cpp implementation of LLaVA.

Usage

Download one of the ggml-model-*.gguf files and mmproj-model-f16.gguf from here. Then invoke:

bin/llava-server -m ggml-model-q5_k.gguf --mmproj mmproj-model-f16.gguf

This starts a server on localhost:8080. You can change the host and port with --host and --port, respectively, and enable HTTP logging with --log-http. You can also interact with the server by opening localhost:8080 in a web browser.
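For example, to serve on a different address and port with HTTP logging enabled (the host and port values below are only illustrative):

bin/llava-server -m ggml-model-q5_k.gguf --mmproj mmproj-model-f16.gguf --host 0.0.0.0 --port 9000 --log-http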

API

The LLaVA endpoint is at /llava. The request body takes the following parameters:

Name            Type    Required  Description
user_prompt     string  yes       The prompt (e.g., "what is this?")
image_file      file    yes       Image data in binary form.
system_prompt   string  no        System prompt.
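
As a sketch, here is one way to call the endpoint with curl, assuming the parameters are sent as multipart/form-data (image.jpg and the system prompt text are placeholders):

curl http://localhost:8080/llava \
  -F "user_prompt=what is this?" \
  -F "system_prompt=You are a helpful assistant." \
  -F "image_file=@image.jpg"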

Build Instructions

The llama.cpp and cpp-httplib repositories are included as git submodules. After cloning, first run:

git submodule init
git submodule update
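
Alternatively, you can fetch the submodules at clone time (assuming the repository lives at the trzy/llava-cpp-server path on GitHub):

git clone --recurse-submodules https://github.com/trzy/llava-cpp-server.git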

Then to build, simply run:

make

So far, this has only been tested on macOS, but it should work anywhere else llama.cpp builds.
