Deployment of the Helm Charts on Intel® Xeon® Processors with Intel® Trust Domain Extensions (Intel® TDX)

This document describes how to deploy the Helm Charts on Intel® Xeon® Processors so that the microservices are protected by Intel TDX through Confidential Containers.

Technical background

Intel Trust Domain Extensions (Intel TDX) is a hardware-based trusted execution environment (TEE) that allows the deployment of hardware-isolated virtual machines (VMs) designed to protect sensitive data and applications from unauthorized access.

Confidential Containers encapsulates pods inside confidential virtual machines, allowing Cloud Native workloads to leverage confidential computing hardware with minimal modification.

Prerequisites

System Requirements

Category	Details
Operating System	Ubuntu 24.04
Hardware Platforms	4th Gen Intel® Xeon® Scalable processors, 5th Gen Intel® Xeon® Scalable processors
Kubernetes Version	1.29+

This guide assumes that:

you are familiar with the regular deployment of the GenAIExamples using Helm Charts,
you have prepared a server with a 4th Gen Intel® Xeon® Scalable processor or later,
you have a single-node Kubernetes cluster already set up on the server for the regular deployment of the GenAIExamples.

Getting Started

Prepare Intel Xeon node

Follow the steps below on the server node with the Intel Xeon Processor:

Install Ubuntu 24.04 and enable Intel TDX
Check whether Intel TDX is enabled:
```
sudo dmesg | grep -i tdx
```
The output should show the Intel TDX module version and initialization status:
```
virt/tdx: TDX module: attributes 0x0, vendor_id 0x8086, major_version 1, minor_version 5, build_date 20240129, build_num 698
(...)
virt/tdx: module initialized
```
If the module version or build_num is lower than shown above, refer to the Intel TDX documentation for update instructions.
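
The version check can be scripted. The following is a minimal sketch, not part of any Intel tooling: `parse_tdx_version` and `tdx_version_ok` are hypothetical helper names, and using 1.5 build 698 as the reference is an assumption taken from the sample output above, not an official minimum.

```shell
# Hypothetical helper: extract the Intel TDX module version from kernel log text.
# Reads dmesg-style text on stdin and prints "major minor build_num" taken from
# the first "TDX module:" line. Returns non-zero if no such line is found.
parse_tdx_version() {
  sed -n 's/.*TDX module:.*major_version \([0-9]*\), minor_version \([0-9]*\),.*build_num \([0-9]*\).*/\1 \2 \3/p' \
    | head -n 1 \
    | grep .
}

# Hypothetical helper: compare a version against the reference 1.5 build 698.
# Args: major minor build_num. Succeeds when the version is not lower than
# the reference (assumption: the reference is the sample output in this guide).
tdx_version_ok() {
  if [ "$1" -ne 1 ]; then [ "$1" -gt 1 ]; return; fi
  if [ "$2" -ne 5 ]; then [ "$2" -gt 5 ]; return; fi
  [ "$3" -ge 698 ]
}
```

With these defined, `sudo dmesg | parse_tdx_version` prints the three numbers, and `tdx_version_ok $(sudo dmesg | parse_tdx_version)` succeeds only when the module is at least as new as the sample.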

Depending on the location of your kubelet config file, increase the kubelet timeout and wait until the node is Ready:

```
# Kubespray installation
echo "runtimeRequestTimeout: 30m" | sudo tee -a /etc/kubernetes/kubelet-config.yaml > /dev/null 2>&1
# Vanilla Kubernetes installation
sudo sed -i 's/runtimeRequestTimeout: .*/runtimeRequestTimeout: 30m/' /var/lib/kubelet/config.yaml > /dev/null 2>&1
# Restart kubelet and wait for the node to be ready
sudo systemctl daemon-reload && sudo systemctl restart kubelet
kubectl wait --for=condition=Ready node --all --timeout=2m
```
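
The two variants above behave differently: the first always appends the key, the second only rewrites a key that already exists. As a hedged alternative, the sketch below handles both cases for either file location. `set_runtime_request_timeout` is a hypothetical helper name, and the code assumes GNU `sed` (as shipped with Ubuntu) and a config file that ends with a newline.

```shell
# Hypothetical helper: set runtimeRequestTimeout in a kubelet config file
# so that the key ends up present exactly once.
# Args: <path to kubelet config> <timeout, e.g. 30m>
set_runtime_request_timeout() {
  config_file="$1"
  timeout="$2"
  if grep -q '^runtimeRequestTimeout:' "$config_file"; then
    # Key already present: rewrite the existing line in place.
    sed -i "s/^runtimeRequestTimeout:.*/runtimeRequestTimeout: ${timeout}/" "$config_file"
  else
    # Key missing: append it as a new top-level entry.
    echo "runtimeRequestTimeout: ${timeout}" >> "$config_file"
  fi
}
```

Run it from a root shell, for example `set_runtime_request_timeout /var/lib/kubelet/config.yaml 30m`, then restart kubelet as shown above. Running it twice leaves a single entry in the file.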

Prepare the cluster

Follow the steps below on the Kubernetes cluster:

Install Confidential Containers Operator

Deploy the ChatQnA

Follow the steps below to deploy ChatQnA:

Set the environment variables and update the dependencies:

```
export HFTOKEN="insert-your-huggingface-token-here"
export MODELNAME="Intel/neural-chat-7b-v3-3"
export myrelease=chatqna
export chartname=chatqna
scripts/update_dependency.sh
helm dependency update $chartname
```

Deploy the Helm Chart setting the tdxEnabled flag for each microservice you want to run using Intel TDX, for example:

```
helm install $myrelease $chartname \
   --set global.HF_TOKEN="${HFTOKEN}" --set vllm.LLM_MODEL_ID="${MODELNAME}" \
   --set redis-vector-db.tdxEnabled=true --set redis-vector-db.resources.limits.memory=4Gi \
   --set retriever-usvc.tdxEnabled=true --set retriever-usvc.resources.limits.memory=7Gi \
   --set tei.tdxEnabled=true --set tei.resources.limits.memory=4Gi \
   --set teirerank.tdxEnabled=true --set teirerank.resources.limits.memory=6Gi \
   --set nginx.tdxEnabled=true \
   --set chatqna-ui.tdxEnabled=true --set chatqna-ui.resources.limits.memory=2Gi \
   --set data-prep.tdxEnabled=true --set data-prep.resources.limits.memory=11Gi \
   --set vllm.tdxEnabled=true --set vllm.resources.limits.memory=80Gi
```
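
The per-microservice flags can also be kept in a values file and passed to Helm with `-f`. The sketch below generates YAML that mirrors the `--set` flags in the example above. `tdx_values_yaml` and the file name `tdx-values.yaml` are hypothetical, and the memory limits are simply the ones from that example.

```shell
# Hypothetical helper: print a Helm values document equivalent to the
# tdxEnabled / resources.limits.memory flags used in the example command.
tdx_values_yaml() {
  # Each entry is "<subchart>=<memory limit>"; an empty limit omits resources.
  for entry in redis-vector-db=4Gi retriever-usvc=7Gi tei=4Gi teirerank=6Gi \
               nginx= chatqna-ui=2Gi data-prep=11Gi vllm=80Gi; do
    name="${entry%%=*}"     # text before the first "="
    memory="${entry#*=}"    # text after the first "="
    printf '%s:\n  tdxEnabled: true\n' "$name"
    if [ -n "$memory" ]; then
      printf '  resources:\n    limits:\n      memory: %s\n' "$memory"
    fi
  done
}
```

For example, run `tdx_values_yaml > tdx-values.yaml`, then `helm install $myrelease $chartname --set global.HF_TOKEN="${HFTOKEN}" --set vllm.LLM_MODEL_ID="${MODELNAME}" -f tdx-values.yaml`. Keeping the limits in a file makes it easier to review and adjust them per microservice.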

Note: resources.limits must be set when Intel TDX is used.

By default, each Kubernetes pod is assigned 1 CPU and 2Gi of memory, but half of that memory is used for the filesystem.

If the pods fail to start due to lack of disk space, increase the memory limits.
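
As a rough aid when choosing a limit, the arithmetic can be written out. This sketch assumes the one-half filesystem share holds for any memory limit. The text above states it only for the 2Gi default, so treat the result as an estimate. `usable_memory_mib` is a hypothetical helper name.

```shell
# Hypothetical helper: estimate the memory left for the workload, in MiB.
# Arg: memory limit in whole GiB.
# Assumption: half of the limit is consumed by the filesystem.
usable_memory_mib() {
  echo $(( $1 * 1024 / 2 ))
}
```

For the 2Gi default, `usable_memory_mib 2` prints 1024, meaning about 1Gi remains for the container processes.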