Provides a content moderation API compatible with the OpenAI moderation interface. Developers can use it to automatically identify harmful content (such as hate speech, violence, or illicit activities) in text or images using multimodal moderation models, which helps keep applications compliant.
This endpoint supports two models:
1. omni-moderation-latest: This model and all of its snapshots support more classification options and multimodal input.
2. text-moderation-latest: Supports only text input, with fewer classification options.
import openai

client = openai.OpenAI(
    api_key="AIHUBMIX_API_KEY",
    base_url="https://aihubmix.com/v1",
)

response = client.moderations.create(
    model="text-moderation-latest",
    input="The Yangtze River rolls eastward, its waves washing away heroes. Right and wrong, success and failure, all seem empty; the green hills remain, though the sun sets many times. The white-haired fisherman and woodcutter on the riverbank, accustomed to watching the autumn moon and spring breeze. A pot of turbid wine brings joy in meeting, how many events through time are all laughed off.",
)

print(response)
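For omni-moderation-latest, text and an image can be sent in a single request. The sketch below assumes this endpoint accepts the same list-of-parts input shape as the OpenAI moderation API (`text` and `image_url` parts). The helper names `build_multimodal_input` and `moderate`, and the example image URL, are hypothetical and only serve the illustration.

```python
from typing import Any


def build_multimodal_input(text: str, image_url: str) -> list[dict[str, Any]]:
    """Build a list-of-parts input with one text part and one image part."""
    return [
        {"type": "text", "text": text},
        {"type": "image_url", "image_url": {"url": image_url}},
    ]


def moderate(client: Any, text: str, image_url: str) -> Any:
    """Send text and an image together.

    `client` is assumed to be an openai.OpenAI instance configured as in
    the example above.
    """
    return client.moderations.create(
        model="omni-moderation-latest",
        input=build_multimodal_input(text, image_url),
    )


# Placeholder URL for illustration only.
parts = build_multimodal_input("Text to check", "https://example.com/image.png")
print([part["type"] for part in parts])
```

Keeping the payload construction separate from the network call makes it easy to inspect or log the request before it is sent.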
The JSON response contains several fields that indicate which types of content (if any) are present in the input and how confident the model is that they are present.
| Output Category | Description |
| --- | --- |
| flagged | Set to true if the model classifies the content as potentially harmful, false otherwise. |
| categories | Contains a dictionary of per-category violation flags. For each category, the value is true if the model flags the corresponding category as violated, false otherwise. |
| category_scores | Contains a dictionary of per-category scores output by the model, denoting the model’s confidence that the input violates OpenAI’s policy for the category. The value is between 0 and 1, where higher values denote higher confidence. |
| category_applied_input_types | This property contains information on which input types were flagged in the response, for each category. For example, if both the image and text inputs to the model are flagged for “violence/graphic”, the violence/graphic property will be set to ["image", "text"]. This is only available on omni models. |
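As a rough illustration of how these fields might be consumed, the sketch below works on a plain dictionary shaped like one moderation result. The sample values are invented for illustration and are not real model output; the function name `summarize_result` and the 0.5 threshold are hypothetical choices, not part of the API.

```python
from typing import Any

# Illustrative sample only: values are invented, not real model output.
sample_result: dict[str, Any] = {
    "flagged": True,
    "categories": {"violence": True, "hate": False},
    "category_scores": {"violence": 0.91, "hate": 0.01},
    "category_applied_input_types": {
        "violence": ["image", "text"],
        "hate": ["text"],
    },
}


def summarize_result(result: dict[str, Any], threshold: float = 0.5) -> dict[str, Any]:
    """Collect the flagged categories and any scores at or above `threshold`."""
    flagged_categories = sorted(
        name for name, hit in result["categories"].items() if hit
    )
    high_scores = {
        name: score
        for name, score in result["category_scores"].items()
        if score >= threshold
    }
    # Only omni models return this field, so fall back to an empty mapping.
    input_types = result.get("category_applied_input_types", {})
    return {
        "flagged": result["flagged"],
        "flagged_categories": flagged_categories,
        "high_scores": high_scores,
        "input_types": {name: input_types.get(name, []) for name in flagged_categories},
    }


print(summarize_result(sample_result))
```

Reading `category_scores` alongside `flagged` lets an application apply its own, stricter or looser, threshold per category instead of relying only on the model's boolean decision.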
The table below describes the types of content that the moderation API can detect, along with the models and input types supported for each category.
Categories labeled “Text only” do not support image input. If you send only images to the model (without text) using omni-moderation-latest, the model will return a score of 0 for these unsupported categories.
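Because a score of 0 on a text-only category says nothing about an image, it can help to drop those categories when handling image-only requests. The sketch below is one possible approach, assuming the text-only categories are exactly those listed as such in this section's table; the name `scores_for_image_only_input` and the example scores are hypothetical.

```python
# Categories this section's table lists as "Text only".
TEXT_ONLY_CATEGORIES = frozenset({
    "harassment",
    "harassment/threatening",
    "hate",
    "hate/threatening",
    "illicit",
    "illicit/violent",
    "sexual/minors",
})


def scores_for_image_only_input(category_scores: dict[str, float]) -> dict[str, float]:
    """Drop text-only categories, whose 0 scores carry no signal for images."""
    return {
        name: score
        for name, score in category_scores.items()
        if name not in TEXT_ONLY_CATEGORIES
    }


# Invented scores for illustration, not real model output.
example_scores = {"hate": 0.0, "illicit": 0.0, "violence": 0.83, "sexual": 0.02}
print(scores_for_image_only_input(example_scores))
```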
| Category | Description | Model | Input |
| --- | --- | --- | --- |
| harassment | Content that expresses, incites, or promotes harassing language towards any target. | All | Text only |
| harassment/threatening | Harassment content that also includes violence or serious harm towards any target. | All | Text only |
| hate | Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is harassment. | All | Text only |
| hate/threatening | Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. | All | Text only |
| illicit | Content that gives advice or instruction on how to commit illicit acts. A phrase like "how to shoplift" would fit this category. | Omni only | Text only |
| illicit/violent | The same types of content flagged by the illicit category, but also includes references to violence or procuring a weapon. | Omni only | Text only |
| self-harm | Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders. | All | Text and images |
| self-harm/intent | Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders. | All | Text and images |
| self-harm/instructions | Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts. | All | Text and images |
| sexual | Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness). | All | Text and images |
| sexual/minors | Sexual content that includes an individual who is under 18 years old. | All | Text only |
| violence | Content that depicts death, violence, or physical injury. | All | Text and images |
violence/graphic
Content that depicts death, violence, or physical injury in graphic detail.