# lvm-uservice

Helm chart for deploying LVM microservice.

## Installing the chart

`lvm-uservice` depends on one of the following backend services:

- TGI: please refer to the tgi chart for more information

- one of the large vision model inference engines: please refer to the lvm-serve chart for more information

First, deploy the dependent service by installing either the tgi or the lvm-serve Helm chart.

After you've deployed the dependent service successfully, run `kubectl get svc` to get the backend service URL, e.g. `http://tgi`, `http://lvm-serve`.
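For example, a quick way to confirm the backend service name and turn it into the endpoint URL passed to the chart (the service name `tgi` below is just an assumption based on the default release name; use whatever `kubectl get svc` actually reports):

```bash
# List services in the current namespace and note the backend service name
kubectl get svc

# The in-cluster URL is http://<service-name>, e.g. for a service named "tgi":
export LVM_ENDPOINT="http://tgi"
```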

To install the `lvm-uservice` chart, run the following:

```bash
cd GenAIInfra/helm-charts/common/lvm-uservice
helm dependency update
export HFTOKEN="insert-your-huggingface-token-here"

# Use TGI as the backend
export LVM_BACKEND="TGI"
export LVM_ENDPOINT="http://tgi"
helm install lvm-uservice . --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set LVM_BACKEND=${LVM_BACKEND} --set LVM_ENDPOINT=${LVM_ENDPOINT} --wait

# Use another lvm-serve engine variant as the backend, see `values.yaml` for more details
# export LVM_ENDPOINT="http://lvm-serve"
# export LVM_BACKEND="LLaVA"
# helm install lvm-uservice . --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set LVM_BACKEND=${LVM_BACKEND} --set LVM_ENDPOINT=${LVM_ENDPOINT} --wait
```
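Once `helm install` returns, you can optionally double-check the release before moving on to verification. This is a minimal sketch, assuming the release name `lvm-uservice` used in the install command above:

```bash
# Confirm the release is listed and deployed
helm list

# Show the status and notes of the lvm-uservice release
helm status lvm-uservice
```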

## Verify

To verify the installation, run `kubectl get pod` to make sure all pods are running.

Then run `kubectl port-forward svc/lvm-uservice 9399:9399` to expose the lvm-uservice service for local access.

Open another terminal and run the following command to verify the service is working:

```bash
curl http://localhost:9399/v1/lvm \
    -X POST \
    -d '{"image": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC", "prompt":"What is this?"}' \
    -H 'Content-Type: application/json'
```
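Alternatively, if you prefer not to use port-forwarding, a throwaway curl pod can call the service directly inside the cluster. This is a sketch under the assumption that the service is reachable as `lvm-uservice:9399` in the same namespace:

```bash
# Run a temporary curl container in the cluster and send the same request
kubectl run curl-test --image=curlimages/curl --rm -it --restart=Never -- \
  curl http://lvm-uservice:9399/v1/lvm \
    -X POST \
    -d '{"image": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC", "prompt":"What is this?"}' \
    -H 'Content-Type: application/json'
```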

## Values

| Key                             | Type   | Default | Description                                                                                            |
| ------------------------------- | ------ | ------- | ------------------------------------------------------------------------------------------------------ |
| global.HUGGINGFACEHUB_API_TOKEN | string | `""`    | Your own Hugging Face API token                                                                          |
| LVM_BACKEND                     | string | `"TGI"` | LVM backend engine; possible values: "TGI", "LLaVA", "VideoLlama", "LlamaVision", "PredictionGuard"      |
| LVM_ENDPOINT                    | string | `""`    | LVM endpoint                                                                                             |
| global.monitoring               | bool   | `false` | Service usage metrics                                                                                    |
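Instead of passing each value with `--set`, the same keys can be collected into a values file. A minimal sketch, assuming the chart directory and the keys from the table above (the file name `my-values.yaml` is only an example):

```bash
# Write an override file with the keys from the table above
cat > my-values.yaml <<'EOF'
global:
  HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
  monitoring: false
LVM_BACKEND: "TGI"
LVM_ENDPOINT: "http://tgi"
EOF

# Install (or upgrade) the chart using the override file
helm upgrade --install lvm-uservice . -f my-values.yaml --wait
```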