This project deploys a FastAPI-based chatbot using Haystack, containerized with Docker, and deployed using Helm on Minikube with full observability (Prometheus + Grafana) and autoscaling.
- Duy Huynh 520644 joshhn
- Mijung Jung 509822 mijung2024
Install the following tools:
brew install minikube kubectl helm hey

minikube start
minikube addons enable metrics-server

docker build -t edu-chatbot:latest .
minikube image load edu-chatbot:latest

helm upgrade --install edu-chatbot ./edu-chatbot

Check if the pods and service are running:
kubectl get pods
kubectl get svc

minikube service edu-chatbot-edu-chatbot --url

Example output:
http://127.0.0.1:52876
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

kubectl create namespace monitoring

helm install prometheus prometheus-community/prometheus -n monitoring

kubectl create configmap chatbot-dashboard \
--from-file=edu-chatbot/grafana-dashboard.json \
-n monitoring

helm install grafana grafana/grafana \
-n monitoring \
--set adminPassword=admin \
--set service.type=NodePort \
--set dashboardsConfigMaps.default=chatbot-dashboard \
--set datasources."datasources\.yaml".apiVersion=1 \
--set datasources."datasources\.yaml".datasources[0].name=Prometheus \
--set datasources."datasources\.yaml".datasources[0].type=prometheus \
--set datasources."datasources\.yaml".datasources[0].access=proxy \
--set datasources."datasources\.yaml".datasources[0].url=http://prometheus-server.monitoring.svc.cluster.local \
--set datasources."datasources\.yaml".datasources[0].isDefault=true

minikube service grafana -n monitoring

Login:
- Username: admin
- Password: admin
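The chain of --set flags above can be easier to review and reuse as a values file. The following is a minimal sketch that prints a file carrying the same settings as those flags; the function name `render_grafana_values` and the file name `grafana-values.yaml` are illustrative and not part of the project.

```shell
# Print Helm values equivalent to the --set flags used for the Grafana install.
# Every key and value mirrors a flag from the command above; nothing new is configured.
render_grafana_values() {
cat <<'EOF'
adminPassword: admin
service:
  type: NodePort
dashboardsConfigMaps:
  default: chatbot-dashboard
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus-server.monitoring.svc.cluster.local
        isDefault: true
EOF
}
```

To use it, redirect the output to a file with `render_grafana_values > grafana-values.yaml` and install with `helm install grafana grafana/grafana -n monitoring -f grafana-values.yaml` in place of the individual flags.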
curl -X POST "http://127.0.0.1:52876/upload/" \
-H "accept: application/json" \
-H "Content-Type: multipart/form-data" \
-F "file=@sample_docs/cloud_computing_notes.txt"

curl "http://127.0.0.1:52876/query/?question=What+is+FastAPI"

hey -z 60s -c 50 "http://127.0.0.1:52876/query/?question=What%20is%20SaaS%3F"

- -z 60s: Run for 60 seconds
- -c 50: 50 concurrent users
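The query URLs above encode the question by hand (`+` or `%20` for spaces, `%3F` for `?`). As a convenience, a small helper can do the percent-encoding for arbitrary questions. This is a sketch in portable shell that assumes ASCII input only; the name `urlencode` is illustrative and not part of the project.

```shell
# Percent-encode a string for use as a URL query value.
# Assumption: input is plain ASCII; multi-byte characters are not handled.
# The body runs in a subshell ( ... ) so its variables do not leak to the caller.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;                  # unreserved: keep as-is
      *) out=$out$(printf '%%%02X' "'$c") ;;          # anything else: %HH
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)
```

For example, `curl "http://127.0.0.1:52876/query/?question=$(urlencode 'What is SaaS?')"` sends the same request path as the hey command above. With a running service, `curl -G --data-urlencode "question=What is SaaS?" "http://127.0.0.1:52876/query/"` is an alternative that lets curl do the encoding.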
kubectl get hpa -w

You should see pods increase based on CPU load.
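To interpret the replica counts in the HPA output, it helps to know the scaling rule Kubernetes documents for the Horizontal Pod Autoscaler: desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). The sketch below computes that value with integer arithmetic. The function name and the 50% target in the example are assumptions, since the chart's actual target is not shown here, and the real controller additionally applies a tolerance band and the configured minimum and maximum replica counts.

```shell
# Estimate the replica count the HPA aims for.
# Arguments: current replicas, current average CPU utilization (%), target utilization (%).
# Uses (a*b + c - 1) / c to get a ceiling with integer division.
desired_replicas() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# Example with a hypothetical 50% target: 2 pods averaging 90% CPU.
desired_replicas 2 90 50
```

With these example numbers the HPA would aim for 4 pods, since 2 * 90 / 50 = 3.6 rounds up to 4.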
helm uninstall edu-chatbot
helm uninstall grafana -n monitoring
helm uninstall prometheus -n monitoring
kubectl delete namespace monitoring

- View Prometheus targets: http://localhost:9090/targets (requires kubectl port-forward)
- Extend HPA to scale by memory or custom metrics via autoscaling/v2
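As a starting point for the memory-based scaling mentioned above, the sketch below prints an autoscaling/v2 HorizontalPodAutoscaler manifest with both a CPU and a memory metric. Several values are assumptions and should be checked against the chart: the Deployment name `edu-chatbot-edu-chatbot` (inferred from the service name), the HPA name, the replica bounds, and the utilization targets. Utilization targets only work if the containers declare resource requests.

```shell
# Print an autoscaling/v2 HPA that scales on CPU and memory utilization.
# Assumed values (not taken from the chart): names, replica bounds, thresholds.
render_hpa() {
cat <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: edu-chatbot-hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: edu-chatbot-edu-chatbot
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
EOF
}
```

The output can be inspected with `render_hpa` and applied with `render_hpa | kubectl apply -f -`. If the chart already creates an HPA for the same Deployment, it is safer to merge the memory metric into that template than to run two autoscalers against one target.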