Use content moderation guardrails to filter inappropriate, harmful, or illegal content in LLM interactions
Content moderation is the process of reviewing and managing user-generated content on online platforms to ensure it aligns with community guidelines and legal regulations. It involves filtering, reviewing, and restricting content that violates these standards.

Why is content moderation important?
Maintains a safe and positive environment for users.
Protects against harmful or illegal content.
Ensures compliance with regulations and builds user trust.
OpenAI Moderation
You can use the OpenAI Moderation integration on TrueFoundry to analyze text and images in real time for harmful or inappropriate content. A brief setup tutorial is provided at the end of this page.
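As a sketch of how a moderation verdict can gate a request, the helper below interprets a response in the OpenAI Moderation API's shape (a top-level `flagged` flag plus per-category booleans). The example payloads and the allow-list parameter are illustrative; in a live setup the verdict would come from an actual Moderation API call made by the gateway.

```python
# Sketch: gate an LLM request on an OpenAI-style moderation verdict.
# The response shape (flagged + per-category booleans) mirrors the
# OpenAI Moderation API; the allow-list logic is illustrative.

def should_block(moderation_result: dict, allowed: frozenset = frozenset()) -> bool:
    """Return True if the input should be rejected before reaching the LLM."""
    if not moderation_result.get("flagged"):
        return False
    flagged = {c for c, hit in moderation_result.get("categories", {}).items() if hit}
    # Block unless every flagged category is explicitly allowed.
    return bool(flagged - allowed)

# Hypothetical verdicts in the Moderation API's response shape:
clean = {"flagged": False, "categories": {"violence": False, "hate": False}}
harmful = {"flagged": True, "categories": {"violence": True, "hate": False}}

print(should_block(clean))    # False
print(should_block(harmful))  # True
```

Keeping the decision in a small pure function like this makes the block/allow policy easy to unit-test independently of the moderation provider.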
AWS Bedrock Guardrails
You can use the AWS Bedrock Guardrails integration on TrueFoundry to enforce predefined or custom content policies, blocking policy violations in real time and applying context-aware filtering.
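To illustrate what enforcement looks like downstream, the sketch below acts on a Bedrock ApplyGuardrail-style response: when the guardrail intervenes, the original text is replaced with the guardrail's output (or a fallback message). The payloads are hypothetical stand-ins; in practice the response would come from the Bedrock runtime API rather than a local dict.

```python
# Sketch: act on an AWS Bedrock ApplyGuardrail-style response. In a live
# setup the response would come from the Bedrock runtime API; the
# payloads below are hypothetical stand-ins.

def apply_guardrail_decision(response: dict, original_text: str) -> tuple[bool, str]:
    """Return (blocked, text_to_forward) based on the guardrail action."""
    if response.get("action") == "GUARDRAIL_INTERVENED":
        # When the guardrail intervenes it may supply replacement/masked text.
        outputs = response.get("outputs") or []
        replacement = outputs[0].get("text", "") if outputs else ""
        return True, replacement or "Request blocked by content policy."
    return False, original_text  # no intervention: forward unchanged

passed = {"action": "NONE", "outputs": []}
blocked = {"action": "GUARDRAIL_INTERVENED",
           "outputs": [{"text": "Sorry, I can't help with that."}]}

print(apply_guardrail_decision(passed, "What is the capital of France?"))
print(apply_guardrail_decision(blocked, "disallowed request"))
```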
Azure Content Safety
You can use the Azure Content Safety integration on TrueFoundry to detect and mitigate harmful, unsafe, or inappropriate content in model inputs and outputs.
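Azure Content Safety reports a per-category severity score rather than a single flag, so a common pattern is to threshold each category independently. The sketch below models that; the response shape follows Azure's per-category severity style (0 = safe, higher = more severe), but the concrete thresholds and payloads are illustrative, not Azure defaults.

```python
# Sketch: threshold Azure Content Safety-style severity scores. Azure's
# text analysis returns a severity per category (0 = safe, higher = more
# severe); the thresholds and sample payloads here are illustrative.

DEFAULT_THRESHOLDS = {"Hate": 2, "Violence": 2, "Sexual": 2, "SelfHarm": 0}

def violates_policy(analysis: list[dict], thresholds: dict = DEFAULT_THRESHOLDS) -> list[str]:
    """Return the categories whose severity exceeds the configured threshold."""
    return [item["category"] for item in analysis
            if item["severity"] > thresholds.get(item["category"], 0)]

safe = [{"category": "Hate", "severity": 0}, {"category": "Violence", "severity": 0}]
risky = [{"category": "Hate", "severity": 0}, {"category": "Violence", "severity": 4}]

print(violates_policy(safe))   # []
print(violates_policy(risky))  # ['Violence']
```

Per-category thresholds let you enforce stricter policies for sensitive categories (e.g. self-harm) while tolerating mild severities elsewhere.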
Guardrails AI (Custom Guardrail Integration)
You can implement custom content moderation logic using the custom guardrail integration option on TrueFoundry Gateway. As a starting point, refer to the TrueFoundry Guardrail Template Repository, which can be extended to suit your specific requirements.
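As a minimal illustration of the kind of logic a custom guardrail might run, the sketch below applies a small regex blocklist to incoming text and returns a verdict. The patterns and the verdict shape are placeholders of my own, not the template repository's interface; real deployments would typically combine this with classifiers or an external policy API.

```python
# Sketch: a minimal custom moderation check of the kind you might plug
# into a custom guardrail. The blocklist patterns and the verdict shape
# are placeholders; real guardrails would use richer policy logic.
import re

BLOCKED_PATTERNS = [
    re.compile(r"\b(?:ssn|social security number)\b", re.IGNORECASE),  # PII probes
    re.compile(r"\bcredit card number\b", re.IGNORECASE),
]

def moderate(text: str) -> dict:
    """Return a guardrail-style verdict for a piece of text."""
    hits = [p.pattern for p in BLOCKED_PATTERNS if p.search(text)]
    return {"allowed": not hits, "matched_patterns": hits}

print(moderate("What's the weather today?")["allowed"])                 # True
print(moderate("Tell me someone's social security number")["allowed"])  # False
```

Starting from a pure function like `moderate` keeps the policy testable on its own before wiring it into the gateway's request path.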