# Kubernetes Deployment for Cogniware IMS

This directory contains Kubernetes deployment configurations for the OPEA Cogniware Inventory Management System.
## Deployment Options

### Option 1: Helm Chart Deployment
> **Note:** Publish the chart to GHCR before running CI. See `helm/PUBLISH_CHART.md` for instructions.
#### Prerequisites

- Kubernetes cluster (v1.24+)
- Helm 3.0+
- kubectl configured
#### Installation

```bash
# Create namespace
kubectl create namespace opea

# Install Cogniware IMS
helm install cogniwareims ./helm \
  --namespace opea \
  --set global.HUGGINGFACEHUB_API_TOKEN=<your-token>

# Check deployment
kubectl get pods -n opea
kubectl get svc -n opea

# Access the application
kubectl port-forward -n opea svc/cogniwareims-ui 3000:3000
```
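
If the port-forward fails because pods are still starting, you can wait for them to become ready first. The label selector below assumes the chart applies the standard `app.kubernetes.io/instance=cogniwareims` label; adjust it to match the labels shown by `kubectl get pods --show-labels -n opea`.

```bash
# Wait (up to 10 minutes) for all pods from the release to become Ready
kubectl wait pod \
  --for=condition=Ready \
  -l app.kubernetes.io/instance=cogniwareims \
  -n opea \
  --timeout=600s
```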
#### Configuration

Customize the deployment by editing `helm/values.yaml` or using `--set`:

```bash
helm install cogniwareims ./helm \
  --namespace opea \
  --set global.HUGGINGFACEHUB_API_TOKEN=<your-token> \
  --set cogniwareims-ui.service.type=LoadBalancer
```
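
For repeatable installs, the same overrides can be kept in a values file instead of repeated `--set` flags. A minimal sketch (the file name `my-values.yaml` is arbitrary; the keys mirror the `--set` paths shown above):

```bash
# Write the overrides to a values file (keys mirror the --set paths above)
cat > my-values.yaml <<'EOF'
global:
  HUGGINGFACEHUB_API_TOKEN: <your-token>
cogniwareims-ui:
  service:
    type: LoadBalancer
EOF

# Install (or upgrade) using the values file
helm upgrade --install cogniwareims ./helm --namespace opea -f my-values.yaml
```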
#### Upgrading

```bash
helm upgrade cogniwareims ./helm --namespace opea
```
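
To change individual values during an upgrade without losing the ones set at install time, Helm's `--reuse-values` flag can be combined with `--set`; `helm rollback` reverts to a previous revision if the upgrade misbehaves. A sketch, reusing the `cogniwareims-ui.service.type` value from the configuration example above:

```bash
# Upgrade while keeping previously set values, overriding only the UI service type
helm upgrade cogniwareims ./helm \
  --namespace opea \
  --reuse-values \
  --set cogniwareims-ui.service.type=ClusterIP

# List revisions and roll back to a known-good one if needed
helm history cogniwareims --namespace opea
helm rollback cogniwareims 1 --namespace opea
```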
#### Uninstalling

```bash
helm uninstall cogniwareims --namespace opea
```
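
`helm uninstall` removes the release's workloads but leaves the namespace behind, and any PersistentVolumeClaims (for example PostgreSQL and Redis data volumes, if the chart creates them) may also remain. For a full cleanup:

```bash
# Remove any leftover PersistentVolumeClaims (this deletes stored data)
kubectl delete pvc --all -n opea

# Remove the namespace created during installation
kubectl delete namespace opea
```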
### Option 2: GMC (GenAI Microservices Connector) Deployment

#### Prerequisites

- Kubernetes cluster with GMC installed
- kubectl configured
#### Installation

```bash
# Install GMC (if not installed)
kubectl apply -f https://github.com/opea-project/GenAIInfra/releases/download/v1.0/gmc.yaml

# Deploy Cogniware IMS
kubectl apply -f gmc/cogniwareims.yaml

# Verify
kubectl get gmconnector cogniwareims
kubectl get pods -l app=cogniwareims
```
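
Once the connector reports its pods as running, the UI can be reached the same way as in the Helm deployment. The service name below is an assumption based on the Helm naming (`cogniwareims-ui`); check `kubectl get svc` for the name the GMC deployment actually creates.

```bash
# List services created for the deployment and forward the UI port
kubectl get svc -l app=cogniwareims
kubectl port-forward svc/cogniwareims-ui 3000:3000   # service name may differ under GMC
```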
## Architecture

The Kubernetes deployment includes:

- **Frontend**: Next.js application (Port 3000)
- **Backend**: FastAPI with megaservice orchestration (Port 8000)
- **PostgreSQL**: Relational database (Port 5432)
- **Redis**: Vector store and cache (Port 6379)
- **OPEA Microservices**:
  - TGI Service (Port 80)
  - LLM Microservice (Port 9000)
  - Embedding Service (Port 6000)
  - Retriever Service (Port 7000)
  - Reranking Service (Port 8000)
  - DataPrep Service (Port 6007)
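
Since these services are only exposed inside the cluster, a quick way to confirm they are reachable on the ports above is a temporary curl pod. The service names in the example are assumptions based on the Helm release naming; substitute the names shown by `kubectl get svc -n opea`.

```bash
# Start a throwaway curl pod inside the opea namespace
kubectl run curl-check --rm -it --image=curlimages/curl -n opea -- sh

# From the pod's shell, probe services on their in-cluster ports, e.g.:
#   curl -s http://cogniwareims-backend:8000/api/health   # backend health endpoint
#   curl -s http://cogniwareims-tgi:80/health             # TGI health endpoint (name assumed)
```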
## Monitoring

### Health Checks

```bash
# Check backend
kubectl exec -it <backend-pod> -n opea -- curl http://localhost:8000/api/health

# Check services
kubectl get pods -n opea
kubectl describe pod <pod-name> -n opea
```
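
The same health endpoint can also be checked from your workstation without exec'ing into a pod by port-forwarding the backend service. The service name `cogniwareims-backend` is assumed to match the deployment name used under Logs below; adjust it if your chart names the service differently.

```bash
# Forward the backend service locally and hit the health endpoint
kubectl port-forward -n opea svc/cogniwareims-backend 8000:8000 &
curl http://localhost:8000/api/health
kill %1   # stop the background port-forward
```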
### Logs

```bash
# View backend logs
kubectl logs -f deployment/cogniwareims-backend -n opea

# View UI logs
kubectl logs -f deployment/cogniwareims-ui -n opea
```
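
To follow logs from every pod in the deployment at once (for example while reproducing an issue), kubectl can stream by label selector. The `app.kubernetes.io/instance` label is an assumption about the chart's labels; `kubectl get pods --show-labels -n opea` shows what is actually set.

```bash
# Stream logs from all pods in the release, prefixed with the pod name
kubectl logs -f -n opea \
  -l app.kubernetes.io/instance=cogniwareims \
  --all-containers --prefix --tail=100
```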
## Scaling

### Horizontal Pod Autoscaling

```bash
kubectl autoscale deployment cogniwareims-backend \
  --namespace opea \
  --cpu-percent=70 \
  --min=2 \
  --max=10
```
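
The autoscaler relies on the Kubernetes metrics-server being installed; without it the CPU metric stays `<unknown>` and no scaling happens. You can confirm the HPA is tracking metrics with:

```bash
# Inspect the autoscaler's current targets and replica counts
kubectl get hpa cogniwareims-backend -n opea
kubectl describe hpa cogniwareims-backend -n opea
```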
### Manual Scaling

```bash
kubectl scale deployment cogniwareims-backend --replicas=3 -n opea
```
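
After scaling (or any upgrade), the rollout status command blocks until the new replica count is fully available:

```bash
# Wait for the deployment to report all replicas available, then confirm
kubectl rollout status deployment/cogniwareims-backend -n opea
kubectl get deployment cogniwareims-backend -n opea
```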
## Troubleshooting

### Pods Not Starting

```bash
kubectl describe pod <pod-name> -n opea
kubectl get events -n opea --sort-by='.lastTimestamp'
```
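
Two frequent causes are slow image pulls and containers that crash on startup. For the latter, the previous container's logs usually show the reason; a missing or invalid `HUGGINGFACEHUB_API_TOKEN` set at install time is one thing worth ruling out for the model-serving pods.

```bash
# Logs from the last crashed container instance
kubectl logs <pod-name> -n opea --previous

# Events scoped to a single pod
kubectl get events -n opea --field-selector involvedObject.name=<pod-name>
```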
### Service Connectivity

```bash
kubectl get svc -n opea
kubectl exec -it <pod-name> -n opea -- curl http://service-name:port/health
```
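
If curl cannot reach a service, check that the service name resolves at all from inside the namespace and that the service has endpoints behind it. A minimal sketch using a throwaway busybox pod (`cogniwareims-backend` is an assumed service name; replace it with one from `kubectl get svc -n opea`):

```bash
# Resolve a service name from inside the cluster
kubectl run dns-check --rm -it --image=busybox:1.36 -n opea -- \
  nslookup cogniwareims-backend

# Confirm the service actually has endpoints behind it
kubectl get endpoints cogniwareims-backend -n opea
```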