Trust and Safety with LLM

The Guardrails service enhances the security of LLM-based applications by offering a suite of microservices designed to ensure trustworthiness, safety, and security.

| Microservice | Description | Contributors |
| ------------ | ----------- | ------------ |
| Bias Detection | Detects biased language (framing bias, epistemological bias, and demographic bias) | Intel |
| Factuality Alignment | Detects hallucination by checking for factual consistency between two text passages | Prediction Guard |
| Guardrails | Provides general guardrails for inputs and outputs to ensure safe interactions, using Llama Guard or WildGuard | Intel |
| Hallucination Detection | Detects hallucination given a text document, a question, and an answer | Intel |
| PII Detection | Detects Personally Identifiable Information (PII) and Business Sensitive Information (BSI) | Prediction Guard |
| Prompt Injection Detection | Detects malicious prompts that cause the system running an LLM to execute the attacker's intentions | Intel, Prediction Guard |
| Toxicity Detection | Detects toxic language (rude, disrespectful, or unreasonable language that is likely to make someone leave a discussion) | Intel, Prediction Guard |
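
Each of these microservices is typically exposed as an HTTP endpoint that screens text before it reaches the LLM or before the LLM's output reaches the user. The sketch below shows one way a client might call such a service; the host, port, route, and payload fields are illustrative assumptions only, so consult the individual microservice's documentation for its actual API.

```python
# Minimal sketch of calling a guardrails microservice over HTTP.
# The URL, route, and payload schema below are assumptions for illustration;
# check the specific microservice's README for its real endpoint and fields.
import requests

GUARDRAILS_URL = "http://localhost:9090/v1/guardrails"  # assumed host/port/route


def screen_text(text: str) -> dict:
    """Send user or LLM-generated text to the guardrails service and return its verdict."""
    response = requests.post(GUARDRAILS_URL, json={"text": text}, timeout=30)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    result = screen_text("Example user prompt to be screened.")
    print(result)  # e.g. a safety flag or a sanitized/blocked response, depending on the service
```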

Additional safety-related microservices will be available soon.