Topic Filtering

Validates that content does not touch on any of the configured denied topics.

Options:

  • denied_topics: An array of topics to disallow, e.g., ["medical advice", "profanity"]
  • threshold: Float between 0 and 1 that controls how sensitive the classifier is. A higher threshold means the content must be more strongly related to a denied topic before the guardrail fails the request; a lower threshold makes the guardrail stricter.
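
To make the effect of threshold concrete, here is a minimal Python sketch of the decision it describes. It is an illustration only, not the guardrail service's actual implementation: the function name violated_topics, the per-topic scores, and the use of a "greater than or equal" comparison are all assumptions made for this example.

```python
from typing import Dict, List


def violated_topics(
    scores: Dict[str, float],
    denied_topics: List[str],
    threshold: float,
) -> List[str]:
    """Return the denied topics whose relevance score reaches the threshold.

    `scores` stands in for a classifier's per-topic relevance scores in [0, 1].
    Both the score format and the >= comparison are assumptions for this sketch.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    # A topic with no score is treated as unrelated (score 0.0).
    return [t for t in denied_topics if scores.get(t, 0.0) >= threshold]


# Hypothetical classifier output for a single message.
scores = {"medical advice": 0.85, "profanity": 0.10}
denied = ["medical advice", "profanity"]

print(violated_topics(scores, denied, 0.8))  # ['medical advice'] -> request fails
print(violated_topics(scores, denied, 0.9))  # [] -> request passes
```

With the same scores, raising the threshold from 0.8 to 0.9 lets the message through, which is the sense in which a higher threshold is more lenient.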

Example:

name: content-filter-guardrails
type: content-filter-guardrails-config
rules:
  - id: content-standard-enforcement
    when:
      models:
        - openai-main/gpt-3-5-turbo
    input_guardrails:
      - type: topics
        action: validate
        options:
          threshold: 0.8
          denied_topics:
            - medical advice
            - profanity
    output_guardrails:
      - type: topics
        action: validate
        options:
          threshold: 0.9
          denied_topics:
            - medical advice
            - profanity
guardrails_service_url: https://content-filter-service.company.com