This is an AI Assistant application built with FastAPI, React, and the llama.cpp library, which runs Meta's LLaMA large language model. llama.cpp is a lightweight C/C++ implementation for running LLaMA-family models efficiently on local hardware. The application automatically pulls the llama.cpp repo, as well as the models and weights.
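How that automatic setup step works isn't spelled out here, but a minimal sketch might look like the following. The llama.cpp URL is the real upstream repo; `MODEL_URL` and the file layout are placeholder assumptions, since the actual download sources aren't documented in this README:

```python
# Hypothetical startup helper: clone llama.cpp and fetch model weights once.
# MODEL_URL and the paths below are placeholders, not the app's real values.
import subprocess
import urllib.request
from pathlib import Path

LLAMA_CPP_REPO = "https://github.com/ggerganov/llama.cpp"
MODEL_URL = "https://example.com/models/model.bin"  # placeholder
MODEL_PATH = Path("models/model.bin")

def ensure_llama_cpp() -> None:
    """Clone llama.cpp on first run; skip if the checkout already exists."""
    if not Path("llama.cpp").exists():
        subprocess.run(["git", "clone", "--depth", "1", LLAMA_CPP_REPO], check=True)

def ensure_model() -> Path:
    """Download the model weights on first run; skip if already present."""
    MODEL_PATH.parent.mkdir(parents=True, exist_ok=True)
    if not MODEL_PATH.exists():
        urllib.request.urlretrieve(MODEL_URL, MODEL_PATH)
    return MODEL_PATH
```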
The backend is implemented with FastAPI and exposes a single endpoint, /chat, for interacting with the AI Assistant. During startup, it automatically downloads the required models and weights for the specified model size.
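The request/response schema isn't documented here, so the following is only a hedged sketch of what the /chat handler could look like; the field names (`prompt`, `reply`) and the `run_inference` stub are illustrative assumptions, not the application's actual code:

```python
# Illustrative sketch of the backend's /chat endpoint; field names and the
# inference stub are assumptions, not the application's actual code.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str  # assumed request field

class ChatResponse(BaseModel):
    reply: str  # assumed response field

def run_inference(prompt: str) -> str:
    # Placeholder: the real app would call into llama.cpp with the
    # downloaded weights here.
    return f"(model reply to: {prompt})"

@app.post("/chat", response_model=ChatResponse)
def chat(request: ChatRequest) -> ChatResponse:
    return ChatResponse(reply=run_inference(request.prompt))
```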
To run the backend, first install the required dependencies:
```
pip install -r requirements.txt
```

Then, start the FastAPI server:
```
uvicorn main:app --host 0.0.0.0 --port 5437
```

The frontend is a basic React application that interacts with the backend through the /chat endpoint. It shows a loading indicator while waiting for a reply from the FastAPI server.
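With the server running, you can sanity-check the /chat endpoint directly before starting the frontend. This snippet uses only the Python standard library and assumes the same hypothetical JSON shape as the sketch above:

```python
# Manual test of the /chat endpoint; the payload shape mirrors the
# hypothetical schema sketched earlier.
import json
import urllib.request

payload = json.dumps({"prompt": "Hello, assistant!"}).encode("utf-8")
request = urllib.request.Request(
    "http://localhost:5437/chat",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))
```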
First, make sure you have Node.js and yarn installed. Then, change into the frontend directory and install its dependencies:
```
cd ai-assistant-front-end
yarn install
```

Run the following command to start the development server:
```
yarn start
```

You can then access the React app at http://localhost:3000.
You can also run the AI Assistant with Docker; a Dockerfile and docker-compose.yml are provided for this purpose. To build and start everything, execute the following command:
```
docker-compose up --build
```

This will build the images and start the containers. You can then access the AI Assistant API docs at http://localhost:5437/docs and the front end at http://localhost:3000.
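Once the containers are up, a short script like the following can confirm that both services are reachable. The ports come from this README; the script itself is just a convenience sketch:

```python
# Smoke test: poll both published ports until the services respond.
import time
import urllib.request

SERVICES = {
    "backend docs": "http://localhost:5437/docs",
    "frontend": "http://localhost:3000",
}

for name, url in SERVICES.items():
    for _ in range(30):  # retry for up to ~30 seconds per service
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                print(f"{name}: HTTP {response.status}")
                break
        except OSError:
            time.sleep(1)
    else:
        print(f"{name}: not reachable at {url}")
```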