Trust and Safety with LLM

The Guardrails service enhances the security of LLM-based applications by offering a suite of microservices designed to ensure trustworthiness, safety, and security.

| Microservice | Description | Contributors |
| ------------ | ----------- | ------------ |
| Bias Detection | Detects biased language (framing bias, epistemological bias, and demographic bias) | Intel |
| Factuality Alignment | Detects hallucination by checking for factual consistency between two text passages | Prediction Guard |
| Guardrails | Provides general guardrails for inputs and outputs to ensure safe interactions, using Llama Guard or WildGuard | Intel |
| Hallucination Detection | Detects hallucination given a text document, a question, and an answer | Intel |
| PII Detection | Detects Personally Identifiable Information (PII) and Business Sensitive Information (BSI) | Prediction Guard |
| Prompt Injection Detection | Detects malicious prompts that cause the system running an LLM to execute the attacker's intentions | Intel, Prediction Guard |
| Toxicity Detection | Detects toxic language (rude, disrespectful, or unreasonable language that is likely to make someone leave a discussion) | Intel, Prediction Guard |
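
Each of these microservices is typically exposed as an HTTP endpoint that screens text before it reaches the LLM or before the LLM's output reaches the user. The sketch below shows one way a client might call such a service; the host, port, route, and payload fields are illustrative assumptions only, so consult the individual microservice's documentation for its actual API.

```python
# Minimal sketch of calling a guardrails microservice over HTTP.
# The URL, route, and payload schema below are assumptions for illustration;
# check the specific microservice's README for its real endpoint and fields.
import requests

GUARDRAILS_URL = "http://localhost:9090/v1/guardrails"  # assumed host/port/route


def screen_text(text: str) -> dict:
    """Send user or LLM-generated text to the guardrails service and return its verdict."""
    response = requests.post(GUARDRAILS_URL, json={"text": text}, timeout=30)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    result = screen_text("Example user prompt to be screened.")
    print(result)  # e.g. a safety flag or a sanitized/blocked response, depending on the service
```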

Additional safety-related microservices will be available soon.