Skip to main content
AIHubMix June 2026 Release Spotlight: new models and platform capabilities
This month AIHubMix added around 20 new models across chat, code, video, and image, and shipped several platform capabilities. The same API key now reaches even more. Here are the highlights.

Auto Router

Set the model name to auto, and the gateway selects the best model out of the hundreds on the platform based on your request — with cost-first, quality-first, or low-latency strategies, billed by the model it actually hits. No manual comparison or model switching, and no client code changes. See Auto Router.

Any model on the Responses protocol

The /v1/responses endpoint is no longer limited to the GPT family — it can now call any model on the platform. Tools built on the Responses protocol (such as Codex CLI) can therefore use GLM, Gemini, DeepSeek, Kimi, Qwen, and more via a local model catalog, instead of being restricted to OpenAI’s official models. See Codex CLI · Custom Models.

Model Mapping & Fallback

Configure alias mapping and failure fallback per API key in the console: your client can use any model name, which the gateway rewrites to the real upstream model; if the primary fails, it switches to a backup automatically, billed by the model that finally responds. A single hiccup won’t drop your production traffic, and the client code stays untouched. See Model Mapping & Fallback.

AIHubMix CLI

A single binary with zero dependencies — no Python, Node, or Go required. Query your balance, manage API keys, and list available models straight from the terminal, with first-class support for scripts and AI agents like Claude Code. See AIHubMix CLI.

AIHubMix Skill (extension for AI coding agents)

A local extension for AI agents that support Skills — Codex, Claude Code, Cursor, Cline, and more. Use natural language to integrate AIHubMix, query models, select by capability, generate examples, and troubleshoot errors. Rather than bundling a fixed model list, the Skill reads live model, pricing, and protocol information from AIHubMix’s official APIs on demand, so the agent never relies on stale memory. See Skills.

Backup domain: api.inferera.com

When the main domain aihubmix.com is unreachable or times out, point your requests at https://api.inferera.com. Endpoints and capabilities are identical — your API key, model, and request body don’t change.

Also shipped

  • Gemini audio input: the OpenAI-compatible endpoint (/v1/chat/completions) now accepts input_audio and returns audio_tokens in usage.
  • GLM 5.2 reasoning effort: the native Zhipu channel supports reasoning_effort for adjustable thinking depth.
  • Open Design integration: AIHubMix is now a built-in BYOK gateway for Open Design.
  • OpenClaw plugin fix: aihubmix-auth is fixed and stable to use.

Stability & fixes

  • Improved billing precision and cache metering accuracy.
  • Fixed missing models in /v1/models.
  • Fixed several video generation and channel testing issues.

New models this month (~20)

Chat / General
  • claude-fable-5 [Retired]: Claude’s latest generation, with stronger safety guardrails (see Changelog · Fable 5 notes).
  • minimax-m3, qwen3.7-plus, glm-5.2, and Doubao doubao-seed-2-1-pro / doubao-seed-2-1-turbo.
Code
  • kimi-k2.7-code and kimi-k2.7-code-highspeed: Kimi’s code series, including a high-speed variant.
  • coding-glm-5.2 and the free coding-glm-5.2-free.
Video
  • Kling: text-to-video, image-to-video, multi-image reference, and omni multimodal generation.
  • happyhorse-1.1: text-to-video (t2v), reference (r2v), and image-to-video (i2v).
Image
  • Baidu musesteamer-air-image for image generation.
Also new
  • grok-build-0.1, hy3-preview, and the free step-3.7-flash-free.

Pricing & notices

  • step-3.7-flash, 90% off (limited time): 0.022/Minputtokens,0.022 / M input tokens, 0.132 / M output tokens.
  • Deprecation & auto-routing: claude-opus-4-20250514 and claude-sonnet-4-20250514 were retired upstream on June 15; the platform auto-routes them to the same-family 4-5 versions.

FAQ

Which models were added this month? Around 20, spanning chat (claude-fable-5 [Retired], minimax-m3, qwen3.7-plus, glm-5.2, doubao-seed-2-1 series), code (kimi-k2.7-code series, coding-glm-5.2), video (Kling, happyhorse-1.1), and image (musesteamer-air-image). How do I use the Auto Router? Set the model name in your request to auto; the gateway selects the best model based on your request and bills by the model it actually hits, with no client code changes. See Auto Router. What if the main domain is unreachable? Replace the request address with the backup domain https://api.inferera.com. Endpoints and capabilities are identical, and no parameters need to change. What is the limited-time price for step-3.7-flash? 0.022/Minputtokensand0.022 / M input tokens and 0.132 / M output tokens. Browse all models on the model catalog, and find integration details in the docs.
Updated: 2026-06-30