# OPEA applications GCP GKE deployment guide

This guide shows how to deploy OPEA applications on Google Cloud Platform (GCP) Google Kubernetes Engine (GKE) using Terraform.

## Prerequisites

- Access to GCP GKE
- [Terraform](https://developer.hashicorp.com/terraform/tutorials/gcp-get-started/install-cli), [GCP CLI](https://cloud.google.com/sdk/docs/install-sdk) and [Helm](https://helm.sh/docs/helm/helm_install/),[kubectl](https://kubernetes.io/docs/tasks/tools/) installed on your local machine.

## Setup

The setup uses Terraform to create GKE cluster with the following properties:

- 1-node GKE cluster with 100 GB disk and `n4-standard-8` preemptible SPOT instance (8 vCPU and 32 GB memory)
- Cluster autoscaling up to 5 nodes

Pre GKE Cluster setup

- After you've installed the gcloud SDK, initialize it by running the following command.

```bash
gcloud init
```

- This will authorize the SDK to access GCP using your user account credentials and add the SDK to your PATH. This steps requires you to login and select the project you want to work in. Finally, add your account to the Application Default Credentials (ADC). This will allow Terraform to access these credentials to provision resources on GCloud.

```bash
gcloud auth application-default login
```

In here, you will find four files used to provision a VPC, subnets and a GKE cluster.

- vpc.tf provisions a VPC and subnet. A new VPC is created for this tutorial so it doesn't impact your existing cloud environment and resources. This file outputs region.

- main.tf provisions a GKE cluster and a separately managed node pool (recommended). Separately managed node pools allows you to customize your Kubernetes cluster profile — this is useful if some Pods require more resources than others. You can learn more here. The number of nodes in the node pool is defined also defined here.

- opea-chatqna.tfvars is a template for the project_id, cluster_name and region variables.

- versions.tf sets the Terraform version to at least 0.14.

## Update your opea-chatqna.tfvars file

Replace the values in your opea-chatqna.tfvars file with your project_id, cluster_name and region. Terraform will use these values to target your project when provisioning your resources. Your opea-chatqna.tfvars file should look like the following.

```bash
 # opea-chatqna.tfvars
  project_id = "REPLACE_ME"
  region     = "us-central1"
```

You can find the project your gcloud is configured to with this command.

```bash
 gcloud config get-value project
```

The region has been defaulted to us-central1; you can find a full list of gcloud regions - https://cloud.google.com/compute/docs/regions-zones

Initialize the Terraform environment.

```bash
terraform init
```

## GKE cluster

By default, 1-node cluster is created which is suitable for running the OPEA application. See `main.tf` upto max_node_count = 5, if you want to tune the cluster properties, e.g., number of nodes, instance types or disk size.

## Persistent Volume Claim

OPEA needs a volume where to store the model. For that we need to create Kubernetes Persistent Volume Claim (PVC). OPEA requires `ReadWriteOnce` option since multiple pods needs access to the storage and they can be on different nodes. On GKE, We are installing Storage Class that support n4-standard-8 which is hyper-balanced . Thus, each OPEA application below uses the file `eks-fs-pvc.yaml` to create Storage Class and PVC in its namespace.

## OPEA Applications

### ChatQnA

Use the commands below to create GKE cluster.

```bash
terraform plan --var-file opea-chatqna.tfvars -out opea-chatqna.plan
terraform apply "opea-chatqna.plan"
```

Once the cluster is ready, update kubectl config

```bash
gcloud container clusters get-credentials "cluster_name"-gke --region us-central1 --project "project_id"
```

Now you should have access to the cluster via the `kubectl` command.

Deploy ChatQnA Application with Helm

```bash
helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HF_TOKEN=${HFTOKEN}
```

Create the Storage Class and PVC as mentioned [above](#persistent-volume-claim)

```bash
kubectl apply -f gke-fs-pvc.yaml -n chatqna
```

After a while, the OPEA application should be running. You can check the status via `kubectl`.

```bash
kubectl get pod -n chatqna
```

You can now start using the OPEA application.

```bash
OPEA_SERVICE=$(kubectl get svc -n chatqna chatqna -ojsonpath='{.status.loadBalancer.ingress[0].hostname}')
curl http://${OPEA_SERVICE}:8888/v1/chatqna \
    -H "Content-Type: application/json" \
    -d '{"messages": "What is the revenue of Nike in 2023?"}'
```

Cleanup

Delete the cluster via the following command.

```bash
helm uninstall -n chatqna chatqna
terraform destroy -var-file opea-chatqna.tfvars
```