Deployment of the Helm Charts on Intel® Xeon® Processors with Intel® Trust Domain Extensions (Intel® TDX)
This document describes how to deploy the Helm Charts on Intel® Xeon® processors with the microservices protected by Intel TDX, with the help of Confidential Containers.
Technical background
Intel Trust Domain Extensions (Intel TDX) is a hardware-based trusted execution environment (TEE) that allows the deployment of hardware-isolated virtual machines (VMs), designed to protect sensitive data and applications from unauthorized access.
Confidential Containers encapsulates pods inside confidential virtual machines, allowing Cloud Native workloads to leverage confidential computing hardware with minimal modification.
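To make the mechanism concrete: Confidential Containers selects a TEE-backed runtime per pod through a Kubernetes RuntimeClass. The sketch below is illustrative only; the RuntimeClass name `kata-qemu-tdx` is an assumption (check `kubectl get runtimeclass` on your cluster), and the Helm charts in this guide are expected to wire this up for you when `tdxEnabled` is set.

```shell
# Illustrative only: a pod opts into a Confidential Containers TEE by naming
# a TEE-backed RuntimeClass. "kata-qemu-tdx" is an assumed name -- check
# `kubectl get runtimeclass` on your cluster for the actual one.
cat > /tmp/tdx-pod-example.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: tdx-demo
spec:
  runtimeClassName: kata-qemu-tdx     # run the pod inside a TD virtual machine
  containers:
  - name: app
    image: nginx
    resources:
      limits:
        memory: 2Gi                   # explicit limits matter with Intel TDX
EOF
echo "wrote /tmp/tdx-pod-example.yaml"
```

You would not apply such a manifest by hand here; the `tdxEnabled` Helm values shown later arrange the equivalent for each microservice.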
Prerequisites
System Requirements
| Category | Details |
|---|---|
| Operating System | Ubuntu 24.04 |
| Hardware Platforms | 4th Gen Intel® Xeon® Scalable processors |
| Kubernetes Version | 1.29+ |
This guide assumes that:

- you are familiar with the regular deployment of the GenAIExamples using Helm Charts,
- you have prepared a server with a 4th Gen Intel® Xeon® Scalable processor or later,
- you have a single-node Kubernetes cluster already set up on the server for the regular deployment of the GenAIExamples.
Getting Started
Prepare the Intel Xeon node
Follow the steps below on the server node with the Intel Xeon processor.

Check whether Intel TDX is enabled:

```shell
sudo dmesg | grep -i tdx
```
The output should show the Intel TDX module version and initialization status:

```
virt/tdx: TDX module: attributes 0x0, vendor_id 0x8086, major_version 1, minor_version 5, build_date 20240129, build_num 698 (...)
virt/tdx: module initialized
```
If the module version or `build_num` is lower than shown above, please refer to the Intel TDX documentation for update instructions.

Depending on the location of your kubelet config file, increase the kubelet request timeout, then restart the kubelet and wait until the node is `Ready`:

```shell
# Kubespray installation
echo "runtimeRequestTimeout: 30m" | sudo tee -a /etc/kubernetes/kubelet-config.yaml > /dev/null 2>&1

# Vanilla Kubernetes installation
sudo sed -i 's/runtimeRequestTimeout: .*/runtimeRequestTimeout: 30m/' /var/lib/kubelet/config.yaml > /dev/null 2>&1

# Restart kubelet and wait for the node to be ready
sudo systemctl daemon-reload && sudo systemctl restart kubelet
kubectl wait --for=condition=Ready node --all --timeout=2m
```
Prepare the cluster
Follow the steps below on the Kubernetes cluster:
Deploy ChatQnA
Follow the steps below to deploy ChatQnA:
Set the environment variables and update the dependencies:
```shell
export HFTOKEN="insert-your-huggingface-token-here"
export MODELNAME="Intel/neural-chat-7b-v3-3"
export myrelease=chatqna
export chartname=chatqna
./update_dependency.sh
helm dependency update $chartname
```
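Before running `helm install`, it can help to fail fast if a variable is missing or the token is still the placeholder. The `check_env` helper below is hypothetical (not part of the charts); the variable names mirror the exports above.

```shell
# Sketch: a fail-fast guard to run before `helm install`. check_env is a
# hypothetical helper, not part of the GenAIExamples charts.
check_env() {
  for var in HFTOKEN MODELNAME myrelease chartname; do
    if [ -z "${!var:-}" ]; then
      echo "missing: $var"
      return 1
    fi
  done
  case "$HFTOKEN" in
    insert-your-*) echo "placeholder token"; return 1 ;;
    *) echo "environment ok" ;;
  esac
}

# Example run with the placeholder still in place -- prints "placeholder token":
HFTOKEN="insert-your-huggingface-token-here" MODELNAME="Intel/neural-chat-7b-v3-3" \
  myrelease=chatqna chartname=chatqna check_env || true
```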
Deploy the Helm Chart, setting the `tdxEnabled` flag for each microservice you want to run using Intel TDX, for example:

```shell
helm install $myrelease $chartname \
  --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" --set vllm.LLM_MODEL_ID="${MODELNAME}" \
  --set redis-vector-db.tdxEnabled=true --set redis-vector-db.resources.limits.memory=4Gi \
  --set retriever-usvc.tdxEnabled=true --set retriever-usvc.resources.limits.memory=7Gi \
  --set tei.tdxEnabled=true --set tei.resources.limits.memory=4Gi \
  --set teirerank.tdxEnabled=true --set teirerank.resources.limits.memory=6Gi \
  --set nginx.tdxEnabled=true \
  --set chatqna-ui.tdxEnabled=true --set chatqna-ui.resources.limits.memory=2Gi \
  --set data-prep.tdxEnabled=true --set data-prep.resources.limits.memory=11Gi \
  --set vllm.tdxEnabled=true --set vllm.resources.limits.memory=80Gi
```
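Because the per-service flags all follow the same pattern, they can also be generated in a loop. This is purely a convenience sketch, not an official interface; the service names and memory limits are copied from the `helm install` command above.

```shell
# Sketch: generate the repetitive --set flags from a service:limit list.
# Names and limits are copied from the helm command above.
services="redis-vector-db:4Gi retriever-usvc:7Gi tei:4Gi teirerank:6Gi chatqna-ui:2Gi data-prep:11Gi vllm:80Gi"
flags=""
for entry in $services; do
  svc=${entry%%:*}
  mem=${entry##*:}
  flags="$flags --set $svc.tdxEnabled=true --set $svc.resources.limits.memory=$mem"
done
flags="$flags --set nginx.tdxEnabled=true"   # nginx needs no memory override here
echo "helm install ${myrelease:-chatqna} ${chartname:-chatqna}$flags"
```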
> [!NOTE]
> The `resources.limits` must be set when Intel TDX is used. By default, each Kubernetes pod is assigned `1` CPU and `2Gi` of memory, but half of that memory is used for the filesystem. If the pods fail to start due to lack of disk space, increase the memory limits.