Don’t let an upstream outage become your outage.
AIHubMix offers two Key-level capabilities. Configure them once in the console and they take effect with no client code changes:
  • Model Mapping is the gateway-layer capability that rewrites the model alias in a client request into the real upstream model.
  • Error Fallback is the capability where, when the primary model call fails, the gateway automatically tries backup models in a preconfigured priority order, transparently to the client.
Both capabilities apply to every client and platform that connects through AIHubMix. Previously, handling a temporary upstream channel outage, setting up disaster recovery across multiple models, or supporting a client that only accepts model names in a specific format meant changing code or building your own gateway. Now you can do all of this in the AIHubMix Key configuration, with no client code changes and no self-built gateway.
Model-name mapping and error fallback are both configured per API Key on the AIHubMix Key management page, and requests are billed by the final responding model.
When creating or editing a key, configure them in the Model name mapping and Fallback models on error sections of the panel:
Configuring model name mapping and fallback in the AIHubMix Create key panel

1. Model Name Mapping

Model name mapping handles the mismatch between “the model name the client sees” and “the model AIHubMix actually calls.” It is a per-key alias rewrite: it rewrites the alias in the request into the target model you configured on the Key.
After a channel is selected for the target model, the platform internally performs another channel-level mapping to the real upstream model. That layer is transparent to you and requires no configuration. You only need to care about the “alias → target model” layer.
Example:
| Client request model name (alias) | AIHubMix target model |
| --- | --- |
| my-gpt | gpt-5.5-pro |
| my-fast | gemini-3.1-pro-preview |
| my-coder | deepseek-v4 |
| my-glm | coding-glm-4.6-free |
All model names in the table above can be looked up on the AIHubMix models page.
Common uses:
  • The client restricts the model name format. For example, Claude Desktop requires model names in Claude style (see Section 5).
  • Set a shorter, more stable alias for a complex Model ID.
  • Keep the client configuration unchanged while switching the real model in the AIHubMix backend.
  • Multiple platforms share one connection naming scheme but route to different models based on the Key.
Character-for-character match: The model name the client sends must match the left side of the mapping character for character. For example, my-gpt-5.5 and my-gpt-5-5 are two different strings; if they don’t match, the mapping won’t be hit.
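The exact-match rule can be pictured as a plain dictionary lookup. The following is a minimal sketch, not the gateway's actual implementation: `ALIAS_MAP` and `resolve_model` are hypothetical names, and passing an unmatched name through unchanged is an assumption made for illustration.

```python
from typing import Dict

# Hypothetical per-Key alias table, copied from the example table above.
ALIAS_MAP: Dict[str, str] = {
    "my-gpt": "gpt-5.5-pro",
    "my-fast": "gemini-3.1-pro-preview",
    "my-coder": "deepseek-v4",
    "my-glm": "coding-glm-4.6-free",
}


def resolve_model(requested: str, alias_map: Dict[str, str]) -> str:
    """Return the mapped target for an exact alias match, else the name as sent."""
    # Plain string equality: no trimming, case folding, or fuzzy matching.
    # Assumption: an unmatched name is passed through unchanged.
    return alias_map.get(requested, requested)
```

With this table, `resolve_model("my-gpt", ALIAS_MAP)` yields `gpt-5.5-pro`, while `My-GPT` or a name with a trailing space does not hit the mapping.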

2. Error Fallback

Error fallback tries backup models in order when the primary model fails. It is not a client-side retry; it is a model switch performed on the AIHubMix gateway under the same Key configuration, so the integrator does not need to pass any extra routing parameter on each request. You can think of fallback as “mapping to an ordered list”: after the primary model fails, the gateway automatically moves down the list to the next backup model.
Example (configured on the same Key):
| Order | Backup model |
| --- | --- |
| 1 | gpt-5.4 |
| 2 | gemini-3.1-pro-preview |

2.1 Trigger Conditions

Fallback only happens when all of the following are true:
  1. The Key is configured with a non-empty backup model list.
  2. Every channel of the primary model has been tried and all failed with a “retryable error” (channels exhausted).
  3. The response has not started returning yet (the first byte / header has not been sent to the client).
  4. The error is not a Key / user-level error (see the 2.2 comparison table below).
After switching to the next backup model, the gateway re-selects a channel with the new model and tries again.
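The four trigger conditions combine as a simple logical AND. The sketch below only restates them as a predicate; `RequestState` and `should_fall_back` are hypothetical names, and the gateway's internal representation is not documented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestState:
    """Hypothetical snapshot of one request, used only for illustration."""
    backup_models: List[str]   # ordered backup list configured on the Key
    channels_exhausted: bool   # every primary channel failed with a retryable error
    response_started: bool     # first byte / header already sent to the client
    key_or_user_error: bool    # e.g. Key invalid, quota exhausted, account banned


def should_fall_back(state: RequestState) -> bool:
    """Return True only when all four documented conditions hold."""
    return (
        len(state.backup_models) > 0
        and state.channels_exhausted
        and not state.response_started
        and not state.key_or_user_error
    )
```

Flipping any single field away from the "fallback" value makes the predicate return False, which matches the "all of the following" wording above.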

2.2 What Falls Back and What Does Not

| Situation | Falls back? |
| --- | --- |
| Primary model fails “retryably” across all channels, response not started | ✅ Falls back |
| A specific channel was specified (Key suffix sk-xxx-{id}, /v1/proxy/{id}/*) | ❌ Does not fall back |
| Response has already started returning (first byte of the stream emitted) | ❌ Does not fall back |
| Client disconnected / request timed out | ❌ Does not fall back |
| Your AIHubMix Key has insufficient quota / is invalid / expired / disabled | ❌ Does not fall back |
| Account banned / risk-control keyword hit | ❌ Does not fall back |
| Free primary model hits a quota / rate limit | ✅ Falls back to a paid backup model |
| A free model in the backup list | ⏭️ Skips that entry, continues to the next |
| A model in the backup list that is outside the Key’s available range | ⏭️ Skipped |
Note: “Key invalid” here means your own AIHubMix Key is invalid, which does not fall back. If the key of some upstream channel is broken, the gateway switches channels, and after channels are exhausted it can still fall back. Don’t confuse the two.

2.3 Billing Basis

Requests are billed by the final responding model. If a fallback model ultimately responds, billing, capabilities, and context limits are all based on that model. The final responding model is also reported in the response header (see Section 4).

2.4 Free Model Rule

A free model cannot be used as a fallback option — a free model can only be the primary model. Putting it in the backup list causes it to be silently skipped and the gateway continues to the next entry. So do not put free models in the fallback list.
Typical usage: Set a free model as the primary model and put paid models in the backup list. When the free primary model hits a quota / rate limit, it automatically falls back to a paid backup model. Normally you save cost using the free quota, and after rate limiting you seamlessly switch to a paid model to guarantee availability. This is one of the most common uses of fallback.
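The skip rules for the backup list (free models and out-of-range models are passed over) can be sketched as a filter. This is an illustration under assumptions: `eligible_backups` is a hypothetical helper, and the sets of free and allowed models are supplied by the caller rather than read from any real API.

```python
from typing import List, Set


def eligible_backups(
    backups: List[str],
    free_models: Set[str],
    allowed_models: Set[str],
) -> List[str]:
    """Keep backup entries, in order, that are paid and within the Key's range."""
    return [
        model
        for model in backups
        if model not in free_models and model in allowed_models
    ]
```

For a backup list of `coding-glm-4.6-free`, `gpt-5.4`, `gemini-3.1-pro-preview`, only the last two would remain, in their configured order.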

3. AIHubMix vs OpenRouter / LiteLLM

Model mapping and fallback are not new concepts; OpenRouter, LiteLLM, and others provide similar capabilities. What sets AIHubMix apart is its low configuration cost:
| Capability | OpenRouter | LiteLLM | AIHubMix |
| --- | --- | --- | --- |
| Where to configure | In code, passing a models array on every request | config.yaml, self-built / self-deployed proxy required | Console, per API Key |
| Does enabling require client code changes | Add the array on every request | Change config + run a proxy | No |
| Self-deployment required | No | Yes | No |
| Native protocol preserved (Claude / Gemini SDK) | Normalized into one schema | Translated and converted | Preserved |
| Billed by the final responding model | Yes | (using your own key) | Yes |
In one sentence: No self-built gateway, not a single line of client code changed — configure once on the Key and it takes effect.

4. Configuration and Verification

4.1 Configuration

  1. Configure the alias mapping on the Key: the alias on the left must match the model name the client actually sends character for character.
  2. Configure the backup model list (an ordered priority list) on the same Key.
  3. The backup list should only contain paid / available models, not free models (they will be skipped).
  4. Models in the backup list must be within the Key’s range of available models (out-of-scope models will be skipped).

4.2 Verification (check the response headers first, not the logs)

When troubleshooting, don’t just look at which model the client selected. The most authoritative, automatable way is to read the response headers:
  • X-Aihubmix-Fallback: true: a fallback occurred on this request (added when the final model ≠ the primary model).
  • X-Aihubmix-Model: the model that actually responded on this request and was billed accordingly.
curl verification example:
curl -s -i https://aihubmix.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"my-gpt","messages":[{"role":"user","content":"hi"}]}' \
  | grep -i -E 'x-aihubmix-(model|fallback)'
The console logs let you cross-check the requested model, the mapped primary model, and the final responding model.
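The same header check can be automated in a script. The sketch below uses only the Python standard library; the endpoint and the two header names come from this section, while `fallback_summary` and `send_probe` are hypothetical helper names and the request body simply mirrors the curl example.

```python
import json
import urllib.request
from typing import Dict, Mapping, Optional, Tuple

API_URL = "https://aihubmix.com/v1/chat/completions"  # same endpoint as the curl example


def fallback_summary(headers: Mapping[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (fell_back, responding_model) from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    fell_back = lowered.get("x-aihubmix-fallback", "").strip().lower() == "true"
    return fell_back, lowered.get("x-aihubmix-model")


def send_probe(api_key: str, model: str) -> Dict[str, str]:
    """Send one minimal chat request and return its response headers."""
    body = json.dumps({"model": model, "messages": [{"role": "user", "content": "hi"}]})
    request = urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        return dict(response.headers.items())
```

A call such as `fallback_summary(send_probe("sk-your-key", "my-gpt"))` would then report whether a fallback occurred and which model responded, assuming the gateway returns the headers described above.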

5. Scenario One: Claude Desktop

Claude Desktop connects to AIHubMix through Gateway, which makes it a typical scenario for model name mapping.
This section assumes you have already completed the basic Claude Desktop integration. For the full integration steps (download and install, developer mode, Gateway configuration, auth scheme, etc.), see Connect AIHubMix in Claude Desktop. This section only covers the incremental configuration for mapping and fallback.

5.1 Why Mapping Is Needed

Claude Desktop connects via Gateway (Anthropic-compatible), and the client restricts model names to Claude style, so model names must use the claude- prefix. This creates a conflict: the client can only send claude- style names, but the models you actually want to call are gpt-5.5-pro, gemini-3.1-pro-preview, and the like. Model name mapping exists for exactly this case: the client sends the alias claude-g-p-t-5.5, and AIHubMix maps it to the real gpt-5.5-pro.
Claude Desktop uses the Claude native /v1/messages interface, so in the examples in this article both mapping and Fallback take effect.

5.2 AIHubMix Mapping and Fallback Configuration

Example configuration:
claude-g-p-t-5.5 -> gpt-5.5-pro
claude-gemi-3.1 -> gemini-3.1-pro-preview
claude-depsek-v4 -> deepseek-v4

fallback:
1. gpt-5.4
2. gemini-3.1-pro-preview
AIHubMix backend configuration of model name mapping and the Fallback list

5.3 Claude Desktop Model List

What you configure in Claude Desktop’s Model list is the alias before mapping — that is, the model name Claude Desktop sends to AIHubMix, not the real upstream model name.
Claude Desktop Model list configuring the pre-mapping model alias
After configuration, the corresponding model appears in Claude Desktop’s model dropdown:
Claude Desktop model dropdown showing the configured alias model
Naming suggestions:
  • Use the claude- prefix for the Model ID.
  • Don’t directly write real model family names like gpt, gemini, or deepseek; use aliases such as g-p-t, gemi, depsek.
  • The Model ID must match the left side of the AIHubMix mapping character for character, otherwise the request won’t hit the expected mapping and may continue down to the error fallback model.
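The naming suggestions above can be checked mechanically before saving a Model ID. This is a rough sketch: `check_claude_alias` is a hypothetical helper, and `REAL_FAMILY_NAMES` contains only the three family names mentioned in the suggestions, so it is not an exhaustive list of what the client may reject.

```python
from typing import List, Set

# Only the family names named in the suggestions above; not exhaustive.
REAL_FAMILY_NAMES = ("gpt", "gemini", "deepseek")


def check_claude_alias(alias: str, mapping_keys: Set[str]) -> List[str]:
    """Return a list of problems with a Claude Desktop Model ID (empty if none)."""
    problems: List[str] = []
    if not alias.startswith("claude-"):
        problems.append("missing claude- prefix")
    for family in REAL_FAMILY_NAMES:
        if family in alias.lower():
            problems.append(f"contains real family name '{family}'")
    # Exact string equality, mirroring the character-for-character rule.
    if alias not in mapping_keys:
        problems.append("no exact match in the Key's mapping")
    return problems
```

For example, `claude-g-p-t-5.5` passes when it is present in the Key's mapping, while `claude-gpt-5.5` is flagged both for the family name and for missing the mapping.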

6. Scenario Two: Multimodal Capability Fallback

Multimodal capability fallback handles the scenario where “the primary model can answer text but does not support the current input type.” For example, the client sends an image or video while the primary model only accepts text input; AIHubMix can continue trying models in the fallback list that support the corresponding modality.
Below is a real test path. This Key’s mapping and fallback configuration is as follows (see the screenshot below). The key point is that the fallback list contains both a text model and a model that supports image understanding:
claude-g-l-m-4.6 -> coding-glm-4.6-free
claude-g-p-t-5.5 -> gpt-5.5-pro
claude-gemi-3.1 -> gemini-3.1-pro-preview

fallback:
1. gpt-5.4
2. gemini-3.1-flash-image
3. veo-3
In Claude Desktop, the selected model shows as claude-g-l-m-4.6 — a model that only supports text input. The user uploaded a screenshot of the AIHubMix models list page and asked “what is this website for.” Because the request contained an image, the text model could not directly handle the input, which triggered the fallback.
A fallback model returns the result after uploading an image in Claude Desktop
The AIHubMix logs show that the final actual call this time was Google AI Studio/gemini-3.1-flash-image, the 2nd entry in the fallback list. The 1st entry, gpt-5.4, also doesn’t support this image input and kept returning a retryable error for this request, so the gateway continued down and landed on gemini-3.1-flash-image, which supports image understanding.
AIHubMix logs showing the request finally routed to the fallback model gemini-3.1-flash-image, which supports image understanding
Be clear about the trigger reason: The fallback here happens because the upstream returned a retryable error for this image input and the primary model’s channels were exhausted. It is the same fallback mechanism as “fall back after the primary model is rate-limited,” just triggered by a different error type (the former is unsupported input, the latter is a quota / rate limit).
Distinguish understanding from generation: This refers to image / video understanding fallback, not image generation or video generation. A chat request does not automatically turn into a generation endpoint; to test drawing or video generation, you should use the corresponding generation endpoint and model. Model capabilities are determined by the Input Modalities currently marked on the AIHubMix models page.

7. Scenario Three: Free Model Fallback

This is one of the most common uses of fallback: set a free model as the primary model and put paid models in the backup list. Normally all requests go through the free model and save cost; once the free primary model hits a quota / rate limit, the gateway automatically falls back to a paid backup model, ensuring service is not interrupted.
Example Key configuration:
primary model (free): coding-glm-4.6-free

fallback:
1. gpt-5.4
2. gemini-3.1-pro-preview
Behavior:
  • While the free quota is still sufficient, the request is answered by the primary model coding-glm-4.6-free and billed as free.
  • After the free primary model is rate-limited, it automatically falls back to gpt-5.4; if gpt-5.4 is also unavailable, it then tries gemini-3.1-pro-preview.
  • Whichever model ultimately responds is the one you are billed for (see 2.3).
Note: A free model can only be the primary model and cannot be put in the fallback list (if you do, it will be skipped; see 2.4). So the correct way to do “free model fallback” is: free as primary, paid as backup, not the other way around.
Verification is again done by reading the response headers: when a fallback occurs, X-Aihubmix-Fallback: true is returned, and X-Aihubmix-Model shows the final responding model (see Section 4).
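The behavior described in this scenario amounts to walking an ordered list and stopping at the first model that can respond. The sketch below simulates that walk; `pick_responding_model` and the `can_respond` callback are hypothetical stand-ins for the gateway's channel selection, which is not modeled here.

```python
from typing import Callable, List, Optional, Tuple

PRIMARY = "coding-glm-4.6-free"
BACKUPS = ["gpt-5.4", "gemini-3.1-pro-preview"]


def pick_responding_model(
    primary: str,
    backups: List[str],
    can_respond: Callable[[str], bool],
) -> Tuple[Optional[str], bool]:
    """Try the primary, then each backup in order; return (model, fell_back)."""
    for model in [primary, *backups]:
        if can_respond(model):
            # fell_back mirrors X-Aihubmix-Fallback: final model != primary.
            return model, model != primary
    return None, False  # nothing could respond
```

The returned model plays the role of `X-Aihubmix-Model`, and it is also the model the request would be billed for under the rule in 2.3.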

8. Supported Endpoints

Model mapping and error fallback currently support the following interface categories:
| Interface category | Key alias mapping | Error Fallback |
| --- | --- | --- |
| OpenAI-compatible interfaces (/v1/chat/completions, /v1/completions, /v1/embeddings, /v1/images/*, /v1/audio/transcriptions, /v1/audio/translations, /v1/rerank, /v1/moderations, /v1/edits, etc.) | ✅ | ✅ |
| Claude native /v1/messages | ✅ | ✅ |
| OpenAI Responses /v1/responses | ✅ | ✅ |
| Other native passthrough interfaces (Gemini native, Ideogram, /v1/videos, /v1/audio/speech (TTS), Stability, OCR, /predictions, etc.) | ❌ | ❌ |
| Specific-channel passthrough /v1/proxy/{channelid}/* | ❌ | ❌ |
| Retrieval by resource ID / file-type (GET /v1/responses/{id}, /v1/videos/{id}, files, etc., where the request body has no model) | ❌ | ❌ |
Key points:
  • Model mapping and error fallback support three interface categories: OpenAI-compatible interfaces, Claude native /v1/messages, and OpenAI Responses /v1/responses.
  • Other native passthrough interfaces (Gemini native, Ideogram, video, TTS, Stability, OCR, predictions, etc.), specific-channel passthrough, and retrieval-by-resource-ID / file-type interfaces are not yet supported.
  • Claude Desktop uses the Claude native /v1/messages, so in the examples in this article both mapping and Fallback take effect.
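A client-side pre-check for whether a request path falls into a supported category might look like the sketch below. It is an approximation: the path lists contain only the endpoints named explicitly in this section (the source lists end with "etc."), treating every non-POST request as unsupported is a simplifying assumption, and `supports_mapping_and_fallback` is a hypothetical helper.

```python
# Endpoints named explicitly in this section; the real list may be longer.
EXACT_PATHS = {
    "/v1/chat/completions",
    "/v1/completions",
    "/v1/embeddings",
    "/v1/audio/transcriptions",
    "/v1/audio/translations",
    "/v1/rerank",
    "/v1/moderations",
    "/v1/edits",
    "/v1/messages",
    "/v1/responses",
}
PREFIX_PATHS = ("/v1/images/",)


def supports_mapping_and_fallback(method: str, path: str) -> bool:
    """Return True if the request looks like a supported interface category."""
    # Assumption: retrieval-style requests (no model in the body) are non-POST.
    if method.upper() != "POST":
        return False
    # /v1/proxy/{channelid}/* and other passthrough paths match neither list.
    return path in EXACT_PATHS or path.startswith(PREFIX_PATHS)
```

Under these assumptions, `POST /v1/messages` is reported as supported, while `POST /v1/audio/speech`, `POST /v1/proxy/3/v1/chat/completions`, and `GET /v1/responses/abc` are not.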

9. FAQ

Q: What should I do if Claude Desktop shows model not found?
A: Check whether the Model ID in Claude Desktop matches the left side of the AIHubMix mapping character for character; if they don’t match, the mapping won’t be hit.

Q: Does fallback affect billing?
A: Billed by the final responding model. Whichever model ultimately responds, you are charged based on that model’s price, capabilities, and context limits.

Q: How do I confirm whether this request actually used fallback?
A: Look at the response headers X-Aihubmix-Fallback: true (a fallback occurred) and X-Aihubmix-Model (the final responding model); see Section 4.

Q: Which errors trigger fallback and which don’t?
A: See the 2.2 comparison table. In short: fallback only happens on an upstream retryable failure, channels exhausted, and the response not yet started; specific channel, response already started, client disconnect / timeout, and Key / user-level errors do not fall back.

Q: Can a free model be put in the fallback list?
A: No, it will be skipped. A free model can only be the primary model.

Q: How is this different from OpenRouter / LiteLLM’s model alias / fallback?
A: AIHubMix is Key-level and platform-managed: configure it once in the console and it takes effect, with no client code changes and no self-built gateway. See Section 3 for details.