The OpenAI Responses API multi-function interface is supported. The following features are available:
  • Text input: plain-text prompts
  • Image input: image understanding (vision)
  • Streaming: incremental event output
  • Web search: built-in web search tool
  • Deep research: For complex analysis and research tasks
  • Reasoning: Reasoning depth control, supports 4 levels (minimal / low / medium / high). Only the gpt-5 series supports minimal.
  • Verbosity: Output length; the gpt-5 series supports 3 levels (low / medium / high)
  • Functions: function (tool) calling
  • image_generation tool usage: image drawing and generation, billed at gpt-image-1 rates.
  • Code Interpreter: allows the model to write and run Python to solve problems. reasoning.effort "minimal" is not supported when using the Code Interpreter with gpt-5.
  • Remote MCP: Calling a remote MCP server
  • Computer Use: automated computer/browser control
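As an illustration, the built-in tools above are enabled through the tools parameter of responses.create. The sketch below assumes the "web_search" and "code_interpreter" tool type names from the OpenAI Responses API; availability depends on the model and the gateway, and the model name and prompt are purely illustrative:

```python
import os

# Tool entries for the Responses API "tools" parameter.
# Type names follow OpenAI's Responses tool naming; verify availability
# for your chosen model before relying on them.
tools = [
    {"type": "web_search"},
    {"type": "code_interpreter", "container": {"type": "auto"}},
]

request_kwargs = {
    "model": "gpt-5",
    "input": "What changed in the latest Python release?",
    "tools": tools,
}

if os.getenv("AIHUBMIX_API_KEY"):  # only call the API when a key is configured
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["AIHUBMIX_API_KEY"],
        base_url="https://aihubmix.com/v1",
    )
    response = client.responses.create(**request_kwargs)
    print(response.output_text)
```

The same tools list can be reused across requests; the model decides per request whether to invoke a tool.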

Usage (Python call):

Usage is the same as the official OpenAI SDK: just replace api_key and base_url to route requests through the gateway. Mainland China can access it directly.
from openai import OpenAI

client = OpenAI(
    api_key="AIHUBMIX_API_KEY", # Replace with the key you generated in AiHubMix
    base_url="https://aihubmix.com/v1"
)
  1. For reasoning models, the reasoning summary in the output can be controlled with the parameter below. Summary detail ranks detailed > auto > None, where auto provides the best balance.
"summary": "auto"
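For example, the summary setting goes inside the reasoning object of a responses.create call. This is a minimal sketch; the model name and prompt are illustrative, and the API call only runs when a key is configured:

```python
import os

# "summary" lives inside the "reasoning" object alongside "effort".
request_kwargs = {
    "model": "o4-mini",  # illustrative; any reasoning model on the responses endpoint
    "input": "Compare binary search trees and hash maps in two sentences.",
    "reasoning": {
        "effort": "medium",  # reasoning depth: minimal / low / medium / high
        "summary": "auto",   # summary richness: detailed > auto > None
    },
}

if os.getenv("AIHUBMIX_API_KEY"):  # only call the API when a key is configured
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["AIHUBMIX_API_KEY"],
        base_url="https://aihubmix.com/v1",
    )
    response = client.responses.create(**request_kwargs)
    print(response.output_text)
```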
  1. Optional deep research models: o3-deep-research and o4-mini-deep-research, supported only on the responses endpoint.
  2. The gpt-5 series focuses on stable reasoning and consistent outputs, and no longer supports the temperature and top_p parameters for controlling randomness. If you need more freedom, you can try gpt-5-chat-latest, which supports temperature.
  3. Reasoning models (o series / gpt-5 series) have deprecated max_tokens. Please use max_completion_tokens for completions or max_output_tokens for responses to explicitly set the output token limit.
from openai import OpenAI

client = OpenAI(
    api_key="sk-***", # Replace with the key generated in your AIHubMix dashboard
    base_url="https://aihubmix.com/v1"
)

response = client.responses.create(
    model="gpt-5", # gpt-5, gpt-5-chat-latest, gpt-5-mini, gpt-5-nano
    input="Why is tarot divination effective? What are the underlying principles and transferable methods? Output format: Markdown", # GPT-5 does not output in Markdown format by default, so you need to explicitly specify it.
    reasoning={
        "effort": "minimal" # Reasoning depth – Controls how many reasoning tokens the model generates before producing a response. Value can be "minimal", "low", "medium" or "high". Default is "medium".
    },
    text={
        "verbosity": "low" # Output length – Verbosity determines how many output tokens are generated. Value can be "low", "medium", or "high". Models before GPT-5 defaulted to "medium" verbosity.
    },
    stream=True
)

for event in response:
    print(event)
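Following the note above on token limits, here is a non-streaming sketch that caps output with max_output_tokens on the responses endpoint (the model name and cap value are illustrative, and the call only runs when a key is configured):

```python
import os

# On the responses endpoint, max_output_tokens replaces the deprecated
# max_tokens for capping generated output.
request_kwargs = {
    "model": "gpt-5-mini",     # illustrative reasoning model
    "input": "Explain HTTP caching in two sentences.",
    "max_output_tokens": 512,  # cap on generated output tokens
}

if os.getenv("AIHUBMIX_API_KEY"):  # only call the API when a key is configured
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["AIHUBMIX_API_KEY"],
        base_url="https://aihubmix.com/v1",
    )
    response = client.responses.create(**request_kwargs)
    print(response.output_text)
```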
Note:
  1. The latest codex-mini-latest does not support search.
  2. The Computer use feature requires integration with Playwright. It’s recommended to refer to the official repository.
Known issues:
  • Invocation is complex for real-world use cases
  • Takes many screenshots, making it slow and often unreliable
  • May trigger CAPTCHA or Cloudflare human verification, potentially leading to infinite loops