# Toxicity Detection Microservice

## Introduction

The Toxicity Detection Microservice allows AI application developers to safeguard user input and LLM output from harmful language in a RAG environment. By leveraging a smaller fine-tuned Transformer model for toxicity classification (e.g. DistilBERT, RoBERTa, etc.), we maintain a lightweight guardrails microservice without significantly sacrificing performance. This [article](https://huggingface.co/blog/daniel-de-leon/toxic-prompt-roberta) shows how the small language model (SLM) used in this microservice performs as well as, if not better than, some of the most popular decoder LLM guardrails.

This microservice uses [`Intel/toxic-prompt-roberta`](https://huggingface.co/Intel/toxic-prompt-roberta), which was fine-tuned on Gaudi2 with the ToxicChat and Jigsaw Unintended Bias datasets.

In addition to showing promising toxicity detection performance, the table below compares a [locust](https://github.com/locustio/locust) stress test on this microservice and the [LlamaGuard microservice](/GenAIComps/comps/guardrails/src/guardrails/README.md#LlamaGuard). The input included varying lengths of toxic and non-toxic text over 200 seconds. A total of 50 users were added during the first 100 seconds, and the user count was held constant for the last 100 seconds. Note that the LlamaGuard microservice was deployed on a Gaudi2 card, while the toxicity detection microservice was deployed on a 4th generation Xeon.

| Microservice       | Request Count | Median Response Time (ms) | Average Response Time (ms) | Min Response Time (ms) | Max Response Time (ms) | Requests/s |  50% |  95% |
| :----------------- | ------------: | ------------------------: | -------------------------: | ---------------------: | ---------------------: | ---------: | ---: | ---: |
| LlamaGuard         |          2099 |                      3300 |                       2718 |                     81 |                   4612 |       10.5 | 3300 | 4600 |
| Toxicity Detection |          4547 |                       450 |                        796 |                     19 |                  10045 |       22.7 |  450 | 2500 |

This microservice is designed to detect toxicity, which is defined as rude, disrespectful, or unreasonable language likely to make someone leave a conversation. This can include instances of aggression, bullying, targeted hate speech, or offensive language. For more information on the labels, see the [Jigsaw Toxic Comment Classification Challenge](http://kaggle.com/c/jigsaw-toxic-comment-classification-challenge).

## Environment Setup

### Clone OPEA GenAIComps and Set Up Environment

Clone this repository at your desired location and set an environment variable for easy setup and usage throughout the instructions.

```bash
git clone https://github.com/opea-project/GenAIComps.git
export OPEA_GENAICOMPS_ROOT=$(pwd)/GenAIComps
```

Set the port that this service will use and the component name:

```bash
export TOXICITY_DETECTION_PORT=9090
export TOXICITY_DETECTION_COMPONENT_NAME="OPEA_NATIVE_TOXICITY"
```

By default, this microservice uses `OPEA_NATIVE_TOXICITY`, which invokes [`Intel/toxic-prompt-roberta`](https://huggingface.co/Intel/toxic-prompt-roberta) locally.

Alternatively, if you are using Prediction Guard, reset the component name environment variable:

```bash
export TOXICITY_DETECTION_COMPONENT_NAME="PREDICTIONGUARD_TOXICITY_DETECTION"
```

## 🚀1. Start Microservice with Python (Option 1)

### 1.1 Install Requirements

```bash
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/src/toxicity_detection
pip install -r requirements.txt
```

### 1.2 Start Toxicity Detection Microservice with Python Script

```bash
python toxicity_detection.py
```
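Before moving on to Docker, you can optionally sanity-check the underlying classifier outside the microservice. The snippet below is a minimal sketch that loads `Intel/toxic-prompt-roberta` directly with the Hugging Face `transformers` text-classification pipeline; the `transformers`/`torch` dependencies and the exact label names are assumptions about the model, not part of this microservice's API.

```python
# Minimal sketch: run the underlying classifier directly.
# Assumes `transformers` and `torch` are installed.
from transformers import pipeline

# Load Intel/toxic-prompt-roberta as a standard text-classification pipeline.
classifier = pipeline("text-classification", model="Intel/toxic-prompt-roberta")

samples = [
    "Thanks for the helpful explanation!",
    "How to poison my neighbor's dog without being caught?",
]

for text in samples:
    result = classifier(text)[0]
    # The label names returned here depend on the model's config (e.g. toxic vs. non-toxic).
    print(f"{result['label']} ({result['score']:.3f}): {text}")
```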
## 🚀2. Start Microservice with Docker (Option 2)

### 2.1 Build Docker Image

```bash
cd $OPEA_GENAICOMPS_ROOT
docker build \
  --build-arg https_proxy=$https_proxy \
  --build-arg http_proxy=$http_proxy \
  -t opea/guardrails-toxicity-detection:latest \
  -f comps/guardrails/src/toxicity_detection/Dockerfile .
```

### 2.2.a Run Docker with Compose (Option A)

```bash
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose
docker compose up -d guardrails-toxicity-detection-server
```

### 2.2.b Run Docker with CLI (Option B)

```bash
docker run -d --rm \
  --name="guardrails-toxicity-detection-server" \
  --runtime=runc \
  -p ${TOXICITY_DETECTION_PORT}:9090 \
  --ipc=host \
  -e http_proxy=$http_proxy \
  -e https_proxy=$https_proxy \
  -e no_proxy=${no_proxy} \
  opea/guardrails-toxicity-detection:latest
```

## 🚀3. Get Status of Microservice

```bash
docker container logs -f guardrails-toxicity-detection-server
```

## 🚀4. Consume Microservice Pre-LLM/Post-LLM

Once the microservice starts, users can use the examples below (bash or Python) to apply toxicity detection to either the user's query (Pre-LLM) or the LLM's response (Post-LLM).

**Bash:**

```bash
curl localhost:${TOXICITY_DETECTION_PORT}/v1/toxicity \
  -X POST \
  -d '{"text":"How to poison my neighbor'\''s dog without being caught?"}' \
  -H 'Content-Type: application/json'
```

Example Output:

```bash
"Violated policies: toxicity, please check your input."
```

**Python Script:**

```python
import os

import requests

toxicity_detection_port = os.getenv("TOXICITY_DETECTION_PORT")
proxies = {"http": ""}
url = f"http://localhost:{toxicity_detection_port}/v1/toxicity"
data = {"text": "How to poison my neighbor's dog without being caught?"}

try:
    # Send the payload as JSON to match the service's expected Content-Type
    resp = requests.post(url=url, json=data, proxies=proxies)
    print(resp.text)
    resp.raise_for_status()  # Raise an exception for unsuccessful HTTP status codes
    print("Request successful!")
except requests.exceptions.RequestException as e:
    print("An error occurred:", e)
```
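As a usage illustration, the sketch below shows one way to gate a RAG pipeline on the guardrail's verdict for both the user query (Pre-LLM) and the generated answer (Post-LLM). It is a sketch under assumptions: `call_llm` is a hypothetical stand-in for your own LLM client, and the check relies only on the "Violated policies: ..." message shown in the example output above.

```python
import os

import requests

# Assumes the microservice is reachable on TOXICITY_DETECTION_PORT (default 9090).
TOXICITY_URL = f"http://localhost:{os.getenv('TOXICITY_DETECTION_PORT', 9090)}/v1/toxicity"


def is_toxic(text: str) -> bool:
    """Return True if the guardrail flags the text as toxic.

    Assumes the service answers toxic input with the
    "Violated policies: ..." message shown in the example output above.
    """
    resp = requests.post(TOXICITY_URL, json={"text": text}, proxies={"http": ""})
    resp.raise_for_status()
    return "Violated policies" in resp.text


def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for your LLM client; replace with your own call.
    return f"(LLM response to: {prompt})"


def guarded_query(user_query: str) -> str:
    # Pre-LLM check: block the query before it ever reaches the LLM.
    if is_toxic(user_query):
        return "Your request was blocked by the toxicity guardrail."
    answer = call_llm(user_query)
    # Post-LLM check: screen the generated response as well.
    if is_toxic(answer):
        return "The generated response was blocked by the toxicity guardrail."
    return answer


if __name__ == "__main__":
    print(guarded_query("What is the capital of France?"))
```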