Navigate to Guardrails
omni-moderation-latest.
Fill in the OpenAI Moderation Form
The flagged field indicates whether any category has been triggered.
Each entry in the categories object is true if that specific category is flagged.
The category_scores object provides a confidence score (0-1) for each category.

Customizable Thresholds
If categories.violence is true, or if category_scores.violence exceeds the configured threshold, the content is flagged for violence.
If categories.harassment is false, and category_scores.harassment is below the threshold, the content passes the harassment check.
A category_scores.violence value of 0.8599265510337075 indicates a high confidence (85.99%) that the content contains violence.
If any category is flagged (either by being true or by exceeding its threshold), TrueFoundry will block the request and return an appropriate error message to maintain content safety standards.
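
The flagging rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not TrueFoundry's implementation: the function name flagged_categories, the threshold values of 0.5, and the harassment score of 0.02 are assumptions made for the example. Only the violence score is taken from the text.

```python
from typing import Dict, List


def flagged_categories(result: dict, thresholds: Dict[str, float]) -> List[str]:
    """Return the categories that would be flagged for one moderation result.

    A category counts as flagged when its boolean in `categories` is true,
    or when its score in `category_scores` exceeds the configured threshold.
    """
    categories = result.get("categories", {})
    scores = result.get("category_scores", {})
    hits = []
    for name in sorted(set(categories) | set(scores)):
        over_threshold = (
            name in thresholds and scores.get(name, 0.0) > thresholds[name]
        )
        if categories.get(name, False) or over_threshold:
            hits.append(name)
    return hits


# Sample result shaped like the fields described above (values are illustrative,
# except the violence score, which is the one quoted in the text).
sample = {
    "flagged": True,
    "categories": {"violence": True, "harassment": False},
    "category_scores": {"violence": 0.8599265510337075, "harassment": 0.02},
}

# Hypothetical per-category thresholds.
thresholds = {"violence": 0.5, "harassment": 0.5}

blocked = flagged_categories(sample, thresholds)
if blocked:
    print("Request blocked for:", ", ".join(blocked))
else:
    print("Request allowed")
```

With these sample values only violence is flagged, so the request would be blocked. Harassment passes because its boolean is false and its score is below the threshold.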