Introduction

This OPEA text generation service can connect to any OpenAI-compatible API endpoint, including local deployments (like vLLM or TGI) and remote services (like OpenRouter.ai).

1 Prepare the TextGen Docker Image

# Build the microservice Docker image

git clone https://github.com/opea-project/GenAIComps
cd GenAIComps

docker build \
  --no-cache \
  --build-arg https_proxy=$https_proxy \
  --build-arg http_proxy=$http_proxy \
  -t opea/llm-textgen:latest \
  -f comps/llms/src/text-generation/Dockerfile .
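
Once the build finishes, you can verify that the image is available locally:

docker images | grep opea/llm-textgen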

2 Set Up Environment Variables

The key environment variable is LLM_ENDPOINT, which specifies the URL of the OpenAI-compatible API. This can be a local address (e.g., for vLLM or TGI) or a remote address. For remote services such as OpenRouter, OPENAI_API_KEY must also be set.

export host_ip=$(hostname -I | awk '{print $1}')
export LLM_MODEL_ID="" # e.g. "google/gemma-3-1b-it:free"
export LLM_ENDPOINT=""  # e.g., "http://localhost:8000" (for local vLLM) or "https://openrouter.ai/api" (be sure to omit the /v1 suffix)
export OPENAI_API_KEY=""
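
For example, a setup pointing at a remote OpenRouter endpoint might look like the following; the model ID and key below are placeholders, so substitute your own values:

export LLM_MODEL_ID="google/gemma-3-1b-it:free"
export LLM_ENDPOINT="https://openrouter.ai/api"    # no /v1 suffix
export OPENAI_API_KEY="<your-openrouter-api-key>"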

3 Run the TextGen Service

export service_name="textgen-service-endpoint-openai"
docker compose -f comps/llms/deployment/docker_compose/compose_text-generation.yaml up -d ${service_name}

To view the service logs:

docker logs textgen-service-endpoint-openai
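
You can also confirm that the container is running:

docker ps --filter "name=textgen-service-endpoint-openai"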

4 Test the Service

First, verify that the remote or local endpoint works with curl. If you are using a service like OpenRouter, test it directly:

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
  "model": "'${LLM_MODEL_ID}'",
  "messages": [
    {
      "role": "user",
      "content": "Tell me a joke?"
    }
  ]
}'
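
If the endpoint is reachable and the key is valid, it returns an OpenAI-style chat completion. The exact fields vary by provider, but the response generally has this shape (the values below are illustrative):

{
  "id": "...",
  "object": "chat.completion",
  "model": "google/gemma-3-1b-it:free",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ]
}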

Then test the OPEA text generation service that wraps the endpoint:

curl http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"'${LLM_MODEL_ID}'","messages":[{"role":"user","content":"Tell me a joke?"}]}'