Toxicity Detection Microservice¶
Introduction¶
The Toxicity Detection Microservice allows AI application developers to safeguard user input and LLM output from harmful language in a RAG environment. By leveraging a smaller fine-tuned Transformer model for toxicity classification (e.g. DistilBERT, RoBERTa, etc.), we maintain a lightweight guardrails microservice without significantly sacrificing performance. This article shows how the small language model (SLM) used in this microservice performs as well as, if not better than, some of the most popular decoder LLM guardrails. This microservice uses Intel/toxic-prompt-roberta, which was fine-tuned on Gaudi2 with the ToxicChat and Jigsaw Unintended Bias datasets.
In addition to showing promising toxicity detection performance, the table below compares a Locust stress test on this microservice with one on the LlamaGuard microservice. The input consisted of toxic and non-toxic prompts of varying lengths sent over 200 seconds: 50 users were ramped up during the first 100 seconds, and the user count then remained constant for the final 100 seconds. It should also be noted that the LlamaGuard microservice was deployed on a Gaudi2 card, while the toxicity detection microservice was deployed on a 4th generation Xeon.
| Microservice | Request Count | Median Response Time (ms) | Average Response Time (ms) | Min Response Time (ms) | Max Response Time (ms) | Requests/s | 50th Percentile (ms) | 95th Percentile (ms) |
|---|---|---|---|---|---|---|---|---|
| LlamaGuard | 2099 | 3300 | 2718 | 81 | 4612 | 10.5 | 3300 | 4600 |
| Toxicity Detection | 4547 | 450 | 796 | 19 | 10045 | 22.7 | 450 | 2500 |
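For reference, a minimal Locust test resembling the one described above might look like the sketch below. The endpoint path and request payload match the examples later in this document, but the exact locustfile used to produce the published numbers is not included here, so the wait time and prompts are illustrative assumptions.

from locust import HttpUser, task, between

class ToxicityUser(HttpUser):
    # Simulated user posting prompts to the toxicity endpoint.
    # Payloads and wait time are illustrative assumptions.
    wait_time = between(0.5, 2.0)

    @task
    def check_toxicity(self):
        self.client.post("/v1/toxicity", json={"text": "Have a wonderful day!"})

With a locustfile like this, a run approximating the test above could be launched with something like locust -f locustfile.py --headless -u 50 -r 0.5 -t 200s --host http://localhost:9090 (see the Locust documentation for flag details).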
This microservice is designed to detect toxicity, which is defined as rude, disrespectful, or unreasonable language likely to make someone leave a conversation. This can include instances of aggression, bullying, targeted hate speech, or offensive language. For more information on the labels, see the Jigsaw Toxic Comment Classification Challenge.
Environment Setup¶
Clone OPEA GenAIComps and Setup Environment¶
Clone this repository at your desired location and set an environment variable for easy setup and usage throughout the instructions.
git clone https://github.com/opea-project/GenAIComps.git
export OPEA_GENAICOMPS_ROOT=$(pwd)/GenAIComps
Set the port that this service will use and the component name:
export TOXICITY_DETECTION_PORT=9090
export TOXICITY_DETECTION_COMPONENT_NAME="OPEA_NATIVE_TOXICITY"
By default, this microservice uses OPEA_NATIVE_TOXICITY, which invokes Intel/toxic-prompt-roberta locally.
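To see what the default classifier does independent of the microservice wrapper, the sketch below runs Intel/toxic-prompt-roberta directly with the Hugging Face transformers pipeline. This is a minimal illustration only, assuming transformers and a PyTorch backend are installed; the exact label names returned by the model are an assumption, so check the model card for details.

from transformers import pipeline

# Sketch only: run the default toxicity classifier directly, outside the microservice.
classifier = pipeline("text-classification", model="Intel/toxic-prompt-roberta")

for text in ["Have a wonderful day!", "You are worthless and nobody likes you."]:
    result = classifier(text)[0]  # dict with 'label' and 'score'
    print(f"{text!r} -> {result['label']} (score={result['score']:.3f})")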
Alternatively, if you are using Prediction Guard, reset the following component name environment variable:
export TOXICITY_DETECTION_COMPONENT_NAME="PREDICTIONGUARD_TOXICITY_DETECTION"
🚀1. Start Microservice with Python (Option 1)¶
1.1 Install Requirements¶
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/src/toxicity_detection
pip install -r requirements.txt
1.2 Start Toxicity Detection Microservice with Python Script¶
python toxicity_detection.py
🚀2. Start Microservice with Docker (Option 2)¶
2.1 Build Docker Image¶
cd $OPEA_GENAICOMPS_ROOT
docker build \
--build-arg https_proxy=$https_proxy \
--build-arg http_proxy=$http_proxy \
-t opea/guardrails-toxicity-detection:latest \
-f comps/guardrails/src/toxicity_detection/Dockerfile .
2.2.a Run Docker with Compose (Option A)¶
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose
docker compose up -d guardrails-toxicity-detection-server
2.2.b Run Docker with CLI (Option B)¶
docker run -d --rm \
--name="guardrails-toxicity-detection-server" \
--runtime=runc \
-p ${TOXICITY_DETECTION_PORT}:9090 \
--ipc=host \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e no_proxy=${no_proxy} \
opea/guardrails-toxicity-detection:latest
🚀3. Get Status of Microservice¶
docker container logs -f guardrails-toxicity-detection-server
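Beyond the container logs, you can confirm the service is responding by sending a benign request to the endpoint used in the next section. This is a sketch only; it assumes the default port 9090 if TOXICITY_DETECTION_PORT is not set.

import os

import requests

# Sketch: send a benign request to verify the service is up and responding.
port = os.getenv("TOXICITY_DETECTION_PORT", "9090")
try:
    resp = requests.post(
        f"http://localhost:{port}/v1/toxicity",
        json={"text": "Hello, how are you today?"},
        timeout=10,
    )
    print("Service responded with HTTP", resp.status_code)
except requests.exceptions.RequestException as e:
    print("Service not reachable:", e)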
🚀4. Consume Microservice Pre-LLM/Post-LLM¶
Once the microservice starts, users can use the examples below (Bash or Python) to apply toxicity detection to either the user's query (Pre-LLM) or the LLM's response (Post-LLM).
Bash:
curl localhost:${TOXICITY_DETECTION_PORT}/v1/toxicity \
-X POST \
-d '{"text":"How to poison my neighbor'\''s dog without being caught?"}' \
-H 'Content-Type: application/json'
Example Output:
"Violated policies: toxicity, please check your input."
Python Script:
import os

import requests

toxicity_detection_port = os.getenv("TOXICITY_DETECTION_PORT")
proxies = {"http": ""}
url = f"http://localhost:{toxicity_detection_port}/v1/toxicity"
data = {"text": "How to poison my neighbor's dog without being caught?"}

try:
    # Send the payload as JSON, matching the curl example above
    resp = requests.post(url=url, json=data, proxies=proxies)
    print(resp.text)
    resp.raise_for_status()  # Raise an exception for unsuccessful HTTP status codes
    print("Request successful!")
except requests.exceptions.RequestException as e:
    print("An error occurred:", e)