OPEA applications GCP GKE deployment guide¶
This guide shows how to deploy OPEA applications on Google Cloud Platform (GCP) Google Kubernetes Engine (GKE) using Terraform.
Prerequisites¶
Setup¶
The setup uses Terraform to create a GKE cluster with the following properties:
1-node GKE cluster with 100 GB disk and n4-standard-8 preemptible Spot instance (8 vCPU and 32 GB memory)
Cluster autoscaling up to 5 nodes
Pre GKE Cluster setup
After you’ve installed the gcloud SDK, initialize it by running the following command.
gcloud init
This authorizes the SDK to access GCP using your user account credentials and adds the SDK to your PATH. This step requires you to log in and select the project you want to work in. Finally, add your account to the Application Default Credentials (ADC). This allows Terraform to use these credentials to provision resources on GCP.
gcloud auth application-default login
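Before going further, it may help to confirm that the command-line tools used throughout this guide (gcloud, terraform, kubectl and helm) are installed. The sketch below is one way to do that; `check_tools` is a hypothetical helper introduced here for illustration, not part of the gcloud SDK or OPEA.

```shell
# check_tools: hypothetical helper (not part of any SDK).
# Prints "missing: <name>" for every tool not found on PATH and
# returns 1 if at least one tool is missing, 0 otherwise.
check_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  return "$status"
}

# Usage (run manually):
#   check_tools gcloud terraform kubectl helm
```

Once the ADC login has completed, running `gcloud auth application-default print-access-token` should print a token, which is a quick way to confirm the credentials Terraform will use are in place.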
The Terraform configuration consists of four files used to provision a VPC, subnets and a GKE cluster.
vpc.tf provisions a VPC and subnet. A new VPC is created for this tutorial so it doesn’t impact your existing cloud environment and resources. This file outputs region.
main.tf provisions a GKE cluster and a separately managed node pool (recommended). Separately managed node pools allow you to customize your Kubernetes cluster profile, which is useful if some Pods require more resources than others. The number of nodes in the node pool is also defined here.
opea-chatqna.tfvars is a template for the project_id, cluster_name and region variables.
versions.tf sets the Terraform version to at least 0.14.
Update your opea-chatqna.tfvars file¶
Replace the values in your opea-chatqna.tfvars file with your project_id, cluster_name and region. Terraform will use these values to target your project when provisioning your resources. Your opea-chatqna.tfvars file should look like the following.
# opea-chatqna.tfvars
project_id = "REPLACE_ME"
region = "us-central1"
You can find the project your gcloud SDK is configured to use with this command.
gcloud config get-value project
The region defaults to us-central1; you can find a full list of GCP regions at https://cloud.google.com/compute/docs/regions-zones
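As a convenience, the tfvars file can also be generated from the shell. The sketch below writes the same two settings shown in the sample (project_id and region); `write_tfvars` is a hypothetical helper introduced here, and it deliberately leaves out any variable not present in that sample.

```shell
# write_tfvars: hypothetical helper that writes a tfvars file with the
# same two settings as the sample in this guide.
#   $1 = GCP project ID, $2 = region, $3 = output path
write_tfvars() {
  printf '# opea-chatqna.tfvars\nproject_id = "%s"\nregion = "%s"\n' "$1" "$2" > "$3"
}

# Usage (run manually; requires an initialized gcloud SDK):
#   write_tfvars "$(gcloud config get-value project)" us-central1 opea-chatqna.tfvars
```

To list the valid region names from the command line instead of the web page, `gcloud compute regions list` can be used.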
Initialize the Terraform environment.
terraform init
GKE cluster¶
By default, a 1-node cluster is created, which is suitable for running the OPEA application, and it can autoscale up to max_node_count = 5. See main.tf if you want to tune the cluster properties, e.g., number of nodes, instance types or disk size.
Persistent Volume Claim¶
OPEA needs a volume in which to store the model. For that we need to create a Kubernetes Persistent Volume Claim (PVC). The PVC uses the ReadWriteOnce access mode, which lets multiple pods share the volume only when they run on the same node. On GKE, we install a Storage Class backed by Hyperdisk Balanced, a disk type supported by n4-standard-8 instances. Thus, each OPEA application below uses the file gke-fs-pvc.yaml to create the Storage Class and PVC in its namespace.
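To make the Storage Class and PVC pairing concrete, the sketch below writes an illustrative manifest to a path of your choice. It is not the manifest shipped with this guide: the Storage Class name, the 50Gi size and the overall layout are assumptions, and `write_pvc_sketch` is a hypothetical helper. Only the PVC name (model-volume, taken from the Helm command in this guide), the ReadWriteOnce access mode and the Hyperdisk Balanced disk type come from the text.

```shell
# write_pvc_sketch: hypothetical helper that writes an ILLUSTRATIVE manifest.
# The StorageClass name and the 50Gi size are assumptions; the real
# manifest used by this guide may differ.
#   $1 = output path
write_pvc_sketch() {
  cat > "$1" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hyperdisk-balanced-sc        # assumed name
provisioner: pd.csi.storage.gke.io   # GKE Persistent Disk CSI driver
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: hyperdisk-balanced
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-volume                 # matches global.modelUsePVC in the Helm command
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: hyperdisk-balanced-sc
  resources:
    requests:
      storage: 50Gi                  # assumed size
EOF
}

# Usage (run manually): write_pvc_sketch example-pvc.yaml
```

With WaitForFirstConsumer, the disk is only provisioned once a pod that uses the claim is scheduled, so a Pending PVC before the application is installed is expected.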
OPEA Applications¶
ChatQnA¶
Use the commands below to create the GKE cluster.
terraform plan --var-file opea-chatqna.tfvars -out opea-chatqna.plan
terraform apply "opea-chatqna.plan"
Once the cluster is ready, update the kubectl config.
gcloud container clusters get-credentials "cluster_name"-gke --region us-central1 --project "project_id"
Now you should have access to the cluster via the kubectl command.
Deploy ChatQnA Application with Helm
helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
Create the Storage Class and PVC as mentioned above
kubectl apply -f gke-fs-pvc.yaml -n chatqna
After a while, the OPEA application should be running. You can check the status via kubectl.
kubectl get pod -n chatqna
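Rather than re-running the status command by hand, you can block until the pods report Ready. The sketch below wraps `kubectl wait`; the helper name `wait_for_pods` and the 15-minute default timeout are assumptions chosen for illustration, and the first start may take longer while the model is downloaded.

```shell
# wait_for_pods: hypothetical wrapper around `kubectl wait`.
#   $1 = namespace, $2 = timeout (defaults to 15m, an arbitrary choice)
# Returns non-zero if any pod in the namespace is not Ready in time.
wait_for_pods() {
  kubectl wait --for=condition=Ready pod --all -n "$1" --timeout="${2:-15m}"
}

# Usage (run manually): wait_for_pods chatqna
```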
You can now start using the OPEA application.
OPEA_SERVICE=$(kubectl get svc -n chatqna chatqna -ojsonpath='{.status.loadBalancer.ingress[0].ip}')
curl http://${OPEA_SERVICE}:8888/v1/chatqna \
-H "Content-Type: application/json" \
-d '{"messages": "What is the revenue of Nike in 2023?"}'
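The load balancer address can take a few minutes to appear after the Helm install, so the lookup above may return an empty string at first. The sketch below polls until an address is reported; `wait_for_lb_address` is a hypothetical helper, and the retry count and delay are arbitrary defaults. It reads both the `ip` and `hostname` fields, since a cloud provider normally populates only one of them.

```shell
# wait_for_lb_address: hypothetical helper that polls a Service until its
# load balancer reports an external address, then prints that address.
#   $1 = namespace, $2 = service name,
#   $3 = max attempts (default 30), $4 = seconds between attempts (default 10)
wait_for_lb_address() {
  attempts=${3:-30}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Concatenate ip and hostname; only one of them is normally set.
    addr=$(kubectl get svc -n "$1" "$2" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null)
    if [ -n "$addr" ]; then
      echo "$addr"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (run manually): OPEA_SERVICE=$(wait_for_lb_address chatqna chatqna)
```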
Cleanup
Uninstall the application and delete the cluster via the following commands.
helm uninstall -n chatqna chatqna
terraform destroy -var-file opea-chatqna.tfvars