Guardrails Microservice

This microservice wraps LLMs in guardrails that enforce responsible behavior in production. By screening model inputs and outputs, it helps you ship trustworthy, safe, and secure LLM-based applications sooner and make AI more widely available within your organization.

These guardrails keep the model from interacting with unsafe content: when a request is flagged, the service responds that it cannot assist. With these protective measures in place, you can shorten production timelines and reduce the risk of unpredictable model responses.

The Guardrails Microservice now offers two primary types of guardrails:

  • Input Guardrails: These are applied to user inputs. An input guardrail can reject the input, halting further processing.

  • Output Guardrails: These are applied to outputs generated by the LLM. An output guardrail can reject the output, preventing it from being returned to the user.

This microservice supports Meta’s Llama Guard and Allen Institute for AI’s WildGuard models.

Llama Guard

Any content detected in one of the following categories is classified as unsafe:

  • Violence and Hate

  • Sexual Content

  • Criminal Planning

  • Guns and Illegal Weapons

  • Regulated or Controlled Substances

  • Suicide & Self Harm

WildGuard

allenai/wildguard was fine-tuned from mistralai/Mistral-7B-v0.3 on the allenai/wildguardmix dataset. Any content detected in one of the following categories is classified as unsafe:

  • Privacy

  • Misinformation

  • Harmful Language

  • Malicious Uses

Clone OPEA GenAIComps and set initial environment variables

git clone https://github.com/opea-project/GenAIComps.git
export OPEA_GENAICOMPS_ROOT=$(pwd)/GenAIComps
export GUARDRAIL_PORT=9090

Start up the HuggingFace Text Generation Inference (TGI) Server

Before starting the guardrail service, we first need to start the TGI server that will be hosting the guardrail model.

Choose one of the following before starting your TGI server.

For Llama Guard:

export SAFETY_GUARD_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
export GUARDRAILS_COMPONENT_NAME=OPEA_LLAMA_GUARD

Or

export SAFETY_GUARD_MODEL_ID="meta-llama/LlamaGuard-7b"
export GUARDRAILS_COMPONENT_NAME=OPEA_LLAMA_GUARD

Other Llama Guard variants can also be used but are not guaranteed to work out of the box.

For WildGuard:

export SAFETY_GUARD_MODEL_ID="allenai/wildguard"
export GUARDRAILS_COMPONENT_NAME=OPEA_WILD_GUARD

Note that both of these models are gated. Before you can use them with your Hugging Face token, you need to complete the access form on each model's page.
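The model choices above can be wrapped in a small helper so that the model ID and component name are always set together. This is only a sketch: select_guard and its short names (llamaguard, wildguard) are hypothetical and not part of GenAIComps; the exported values are the ones listed above.

```shell
# Hypothetical helper (not part of GenAIComps): map a short name to the
# model ID and component name documented above.
select_guard() {
  case "$1" in
    llamaguard)
      export SAFETY_GUARD_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
      export GUARDRAILS_COMPONENT_NAME=OPEA_LLAMA_GUARD
      ;;
    wildguard)
      export SAFETY_GUARD_MODEL_ID="allenai/wildguard"
      export GUARDRAILS_COMPONENT_NAME=OPEA_WILD_GUARD
      ;;
    *)
      # Unknown name: report it and leave the environment untouched.
      echo "unknown guard: $1 (expected llamaguard or wildguard)" >&2
      return 1
      ;;
  esac
}

# Example: pick WildGuard and show the resulting pair.
select_guard wildguard
echo "$SAFETY_GUARD_MODEL_ID -> $GUARDRAILS_COMPONENT_NAME"
```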

Follow the steps here to start the TGI server container, with LLM_MODEL_ID set to your SAFETY_GUARD_MODEL_ID as shown below:

export LLM_MODEL_ID=$SAFETY_GUARD_MODEL_ID

While the container starts up and loads the model, set the endpoint that you will use to make requests to the TGI server:

export SAFETY_GUARD_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
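Because the endpoint and the later container commands depend on several variables, a quick pre-flight check can catch an empty one early. The following is a minimal sketch; require_vars is a hypothetical helper, and the variable list in the usage comment is taken from the commands in this document.

```shell
# Hypothetical pre-flight check (not part of GenAIComps): report every
# variable in the argument list that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage before starting the containers:
# require_vars SAFETY_GUARD_MODEL_ID GUARDRAILS_COMPONENT_NAME \
#   host_ip LLM_ENDPOINT_PORT HF_TOKEN || echo "set the variables listed above"
```

The function returns a non-zero status if any variable is missing, so it can gate the rest of a setup script.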

Verify that the TGI Server is ready for inference

First, check that the TGI server has successfully loaded the guardrail model. Loading the model can take 5 to 10 minutes. To check, inspect the container logs:

docker logs tgi-gaudi-server

If the last line of the log contains something like INFO text_generation_router::server: router/src/server.rs:2209: Connected, then your TGI server is ready and the following curl request should work:

curl localhost:${LLM_ENDPOINT_PORT}/generate \
  -X POST \
  -d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
  -H 'Content-Type: application/json'

Run the docker logs command again to confirm that the curl request resulted in Success.
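The readiness check described above (the last log line contains Connected) can be scripted instead of inspected by eye. This is a sketch under that assumption only; tgi_log_ready is a hypothetical helper name, and the exact log line may differ between TGI versions.

```shell
# Hypothetical helper: succeed when the last line read from stdin
# contains "Connected", as described for the TGI log above.
tgi_log_ready() {
  tail -n 1 | grep -q "Connected"
}

# Usage (run before sending the first request, since later requests
# append new lines to the log):
# until docker logs tgi-gaudi-server 2>&1 | tgi_log_ready; do
#   sleep 10
# done
```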

🚀1. Start Microservice with Python (Option 1)

To start the Guardrails microservice, first install the required Python packages.

1.1 Install Requirements

pip install $OPEA_GENAICOMPS_ROOT
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/src/guardrails
pip install -r requirements.txt

1.2 Start Guardrails Service

python opea_guardrails_microservice.py

🚀2. Start Microservice with Docker (Option 2)

With the TGI server running, we can now start the guardrail service container.

2.1 Build Docker Image

cd $OPEA_GENAICOMPS_ROOT
docker build -t opea/guardrails:latest \
  --build-arg https_proxy=$https_proxy \
  --build-arg http_proxy=$http_proxy \
  -f comps/guardrails/src/guardrails/Dockerfile .

2.2.a Run with Docker Compose (Option A)

To run with Llama Guard:

docker compose -f $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose/compose.yaml up -d llamaguard-guardrails-server

To run with WildGuard:

docker compose -f $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose/compose.yaml up -d wildguard-guardrails-server

2.2.b Run Docker with CLI (Option B)

To run with Llama Guard:

docker run -d \
  --name="llamaguard-guardrails-server" \
  -p ${GUARDRAIL_PORT}:${GUARDRAIL_PORT} \
  --ipc=host \
  -e http_proxy=$http_proxy \
  -e https_proxy=$https_proxy \
  -e no_proxy=$no_proxy \
  -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT \
  -e HUGGINGFACEHUB_API_TOKEN=$HF_TOKEN \
  opea/guardrails:latest

To run with WildGuard:

docker run -d \
  --name="wildguard-guardrails-server" \
  -p ${GUARDRAIL_PORT}:${GUARDRAIL_PORT} \
  --ipc=host \
  -e http_proxy=$http_proxy \
  -e https_proxy=$https_proxy \
  -e no_proxy=$no_proxy \
  -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT \
  -e HUGGINGFACEHUB_API_TOKEN=$HF_TOKEN \
  -e GUARDRAILS_COMPONENT_NAME="OPEA_WILD_GUARD" \
  opea/guardrails:latest

🚀3. Consume Guardrails Service

3.1 Check Service Status

curl http://localhost:${GUARDRAIL_PORT}/v1/health_check \
  -X GET \
  -H 'Content-Type: application/json'
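The service may need a moment after the container starts, so the status check above can be retried rather than run once. The following is a sketch; wait_until is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```shell
# Hypothetical retry helper: run a command up to <attempts> times,
# sleeping <delay> seconds after each failed attempt except the last.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage: poll the health endpoint up to 30 times, 10 seconds apart.
# curl -f makes HTTP error responses count as failures.
# wait_until 30 10 curl -sf "http://localhost:${GUARDRAIL_PORT}/v1/health_check"
```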

3.2 Consume Guardrails Service

curl http://localhost:${GUARDRAIL_PORT}/v1/guardrails \
  -X POST \
  -d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
  -H 'Content-Type: application/json'

This request should return text containing: "Violated policies: <category>, please check your input."

Here, <category> is Violent Crimes for Meta-Llama-Guard-2-8B and harmful for wildguard.
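A calling script can use the message format quoted above to decide whether an input was rejected. This is a minimal sketch that assumes only that the response contains that message somewhere in its text; violated_category is a hypothetical helper, and the full response schema is not assumed.

```shell
# Hypothetical helper: print the violated category if the response text
# contains the message quoted above; return non-zero otherwise.
violated_category() {
  printf '%s\n' "$1" \
    | sed -n 's/.*Violated policies: \(.*\), please check your input.*/\1/p' \
    | grep .
}

# Usage:
# response=$(curl -s "http://localhost:${GUARDRAIL_PORT}/v1/guardrails" \
#   -X POST \
#   -d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
#   -H 'Content-Type: application/json')
# if category=$(violated_category "$response"); then
#   echo "blocked: $category"
# fi
```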