Deployment of the Helm Charts on Intel® Xeon® Processors with Intel® Trust Domain Extensions (Intel® TDX)

This document outlines the deployment process of Helm Charts on Intel® Xeon® Processors where the microservices are protected by Intel TDX with the help of Confidential Containers.

Technical background

Intel Trust Domain Extensions (Intel TDX) is a hardware-based trusted execution environment (TEE) that allows the deployment of hardware-isolated virtual machines (VMs) designed to protect sensitive data and applications from unauthorized access.

Confidential Containers encapsulates pods inside confidential virtual machines, allowing Cloud Native workloads to leverage confidential computing hardware with minimal modification.

Prerequisites

System Requirements

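| Category           | Details                                                                              |
| ------------------ | ------------------------------------------------------------------------------------ |
| Operating System   | Ubuntu 24.04                                                                         |
| Hardware Platforms | 4th Gen Intel® Xeon® Scalable processors<br>5th Gen Intel® Xeon® Scalable processors |
| Kubernetes Version | 1.29+                                                                                |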

This guide assumes that:

  1. you are familiar with the regular deployment of the GenAIExamples using Helm Charts,

  2. you have prepared a server with 4th Gen Intel® Xeon® Scalable Processor or later,

  3. you have a single-node Kubernetes cluster already set up on the server for the regular deployment of the GenAIExamples.

Getting Started

Prepare Intel Xeon node

Follow the steps below on the server node with the Intel Xeon Processor:

  1. Install Ubuntu 24.04 and enable Intel TDX

  2. Check whether Intel TDX is enabled:

    sudo dmesg | grep -i tdx
    

    The output should show the Intel TDX module version and initialization status:

    virt/tdx: TDX module: attributes 0x0, vendor_id 0x8086, major_version 1, minor_version 5, build_date 20240129, build_num 698
    (...)
    virt/tdx: module initialized
    

    If the module version or build_num is lower than shown above, refer to the Intel TDX documentation for update instructions.

  3. Increase the kubelet timeout using the command that matches the location of your kubelet config file (run only one of the two), then restart kubelet and wait until the node is Ready:

    # Kubespray installation
    echo "runtimeRequestTimeout: 30m" | sudo tee -a /etc/kubernetes/kubelet-config.yaml > /dev/null 2>&1
    # Vanilla Kubernetes installation
    sudo sed -i 's/runtimeRequestTimeout: .*/runtimeRequestTimeout: 30m/' /var/lib/kubelet/config.yaml > /dev/null 2>&1
    # Restart kubelet and wait for the node to be ready
    sudo systemctl daemon-reload && sudo systemctl restart kubelet
    kubectl wait --for=condition=Ready node --all --timeout=2m
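
The version check in step 2 can be scripted. The following is a minimal sketch, not an official tool: `tdx_build_num` and `tdx_check` are hypothetical helper names, and for simplicity the sketch compares only `build_num` (not `major_version` or `minor_version`) against a minimum that you supply.

```shell
#!/usr/bin/env bash
# Sketch: inspect kernel log text (read from stdin) for the Intel TDX module.
# tdx_build_num and tdx_check are hypothetical helper names.

tdx_build_num() {
    # Print the build_num from the first "virt/tdx: TDX module:" line, if any.
    sed -n 's/.*virt\/tdx: TDX module:.*build_num \([0-9][0-9]*\).*/\1/p' | head -n 1
}

tdx_check() {
    # $1: minimum acceptable build number; kernel log text on stdin.
    local min="$1" log found
    log="$(cat)"
    found="$(printf '%s\n' "$log" | tdx_build_num)"
    if [ -z "$found" ]; then
        echo "TDX module line not found"
        return 2
    fi
    if ! printf '%s\n' "$log" | grep -q 'virt/tdx: module initialized'; then
        echo "TDX module build $found found but not initialized"
        return 3
    fi
    if [ "$found" -lt "$min" ]; then
        echo "TDX module build $found is older than $min"
        return 1
    fi
    echo "TDX module build $found OK"
}
```

With the functions loaded in the current shell, `sudo dmesg | tdx_check 698` applies the check to the live kernel log, using the build number shown above as the minimum.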
    

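The two commands in step 3 behave differently: the `tee -a` form always appends, and the `sed` form only changes a key that already exists. A possible alternative, sketched here under the assumption that the kubelet config is a flat YAML file with `runtimeRequestTimeout` as a top-level key, sets the value idempotently for either file. `set_kubelet_timeout` is a hypothetical helper name, and `sed -i` is used in its GNU form as shipped with Ubuntu.

```shell
#!/usr/bin/env bash
# Sketch: set runtimeRequestTimeout exactly once in a kubelet config file.
# set_kubelet_timeout is a hypothetical helper name.

set_kubelet_timeout() {
    # $1: path to the kubelet config file, $2: timeout value (for example 30m)
    local file="$1" value="$2"
    if grep -q '^runtimeRequestTimeout:' "$file"; then
        # Key already present: rewrite its value in place.
        sed -i "s/^runtimeRequestTimeout:.*/runtimeRequestTimeout: ${value}/" "$file"
    else
        # Key absent: make sure the file ends with a newline, then append.
        if [ -n "$(tail -c 1 "$file")" ]; then
            echo >> "$file"
        fi
        printf 'runtimeRequestTimeout: %s\n' "$value" >> "$file"
    fi
}
```

Run from a root shell, `set_kubelet_timeout /var/lib/kubelet/config.yaml 30m` (or the Kubespray path) can be repeated safely; kubelet still has to be restarted afterwards as shown in step 3.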
Prepare the cluster

Follow the steps below on the Kubernetes cluster:

  1. Install Confidential Containers Operator
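
After the operator is installed, it can be useful to confirm that a TDX-capable runtime class is registered before deploying. The sketch below assumes the operator registers a runtime class named `kata-qemu-tdx`; that name is an assumption and should be checked against the operator version you installed. `has_runtime_class` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: check whether a runtime class appears in the output of
# `kubectl get runtimeclass -o name` (read from stdin).
# has_runtime_class is a hypothetical helper name.

has_runtime_class() {
    # $1: runtime class name, for example kata-qemu-tdx (assumed name)
    grep -qxF "runtimeclass.node.k8s.io/$1"
}
```

For example, `kubectl get runtimeclass -o name | has_runtime_class kata-qemu-tdx && echo "TDX runtime class present"`.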

Deploy the ChatQnA

Follow the steps below to deploy ChatQnA:

  1. Set the environment variables and update the dependencies:

    export HFTOKEN="insert-your-huggingface-token-here"
    export MODELNAME="Intel/neural-chat-7b-v3-3"
    export myrelease=chatqna
    export chartname=chatqna
    ./update_dependency.sh
    helm dependency update $chartname
    
  2. Deploy the Helm Chart, setting the tdxEnabled flag for each microservice that should run under Intel TDX, for example:

    helm install $myrelease $chartname \
       --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" --set vllm.LLM_MODEL_ID="${MODELNAME}" \
       --set redis-vector-db.tdxEnabled=true --set redis-vector-db.resources.limits.memory=4Gi \
       --set retriever-usvc.tdxEnabled=true --set retriever-usvc.resources.limits.memory=7Gi \
       --set tei.tdxEnabled=true --set tei.resources.limits.memory=4Gi \
       --set teirerank.tdxEnabled=true --set teirerank.resources.limits.memory=6Gi \
       --set nginx.tdxEnabled=true \
       --set chatqna-ui.tdxEnabled=true --set chatqna-ui.resources.limits.memory=2Gi \
       --set data-prep.tdxEnabled=true --set data-prep.resources.limits.memory=11Gi \
       --set vllm.tdxEnabled=true --set vllm.resources.limits.memory=80Gi
    

[!NOTE] resources.limits must be set when Intel TDX is used.

By default, each Kubernetes pod is assigned 1 CPU and 2Gi of memory, and half of that memory is used for the filesystem.

If pods fail to start due to lack of disk space, increase their memory limits.
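
Because every TDX-protected service needs the same pair of settings, the long `--set` list can be generated from a short table instead of being typed by hand. This is only a sketch: `tdx_set_values` is a hypothetical helper, and it relies on Helm accepting several comma-separated `key=value` pairs in a single `--set` argument.

```shell
#!/usr/bin/env bash
# Sketch: build one comma-separated --set value from "service [memory]" lines.
# tdx_set_values is a hypothetical helper name.

tdx_set_values() {
    local svc mem out=""
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while read -r svc mem || [ -n "$svc" ]; do
        # Skip blank lines and comment lines.
        case "$svc" in ''|'#'*) continue ;; esac
        out="${out:+$out,}${svc}.tdxEnabled=true"
        # Add a memory limit only when one is given for the service.
        if [ -n "$mem" ]; then
            out="${out},${svc}.resources.limits.memory=${mem}"
        fi
    done
    printf '%s\n' "$out"
}
```

Given a file (here called `tdx-services.txt`, a made-up name) with lines such as `tei 4Gi` and `nginx`, the result can be passed as `--set "$(tdx_set_values < tdx-services.txt)"` alongside the token and model settings from step 2. Raising a limit then means editing one line of the table.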