AiHubMix Documentation Hub

Reasoning Configuration

Use the reasoning parameter to configure reasoning behavior:

curl -X POST https://aihubmix.com/v1/responses \
  -H "Authorization: Bearer YOUR_AIHUBMIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5",
    "input": "Plan a week-long trip to the US for me.",
    "reasoning": {
      "effort": "high"
    },
    "max_output_tokens": 5000
  }'

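Reasoning Effort

The effort parameter controls how much computation the model spends on reasoning.

Effort level	Description
minimal	Basic reasoning with minimal computation
low	Light reasoning, suited to simple questions
medium	Balanced reasoning, suited to moderately complex problems
high	Deep reasoning, suited to complex problems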
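The effort levels above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the AiHubMix API: the helper name build_reasoning_request and the "medium" default are assumptions made for illustration, and the field names mirror the curl example earlier in this section.

```python
# Effort levels taken from the table above.
VALID_EFFORTS = ("minimal", "low", "medium", "high")


def build_reasoning_request(model, prompt, effort="medium", max_output_tokens=5000):
    """Return a request body for POST /v1/responses with a validated effort level.

    Hypothetical helper: the default effort of "medium" is an assumption,
    not a documented API default.
    """
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return {
        "model": model,
        "input": prompt,
        "reasoning": {"effort": effort},
        "max_output_tokens": max_output_tokens,
    }


payload = build_reasoning_request(
    "glm-5", "Plan a week-long trip to the US for me.", effort="high"
)
print(payload["reasoning"])
```

Rejecting an unknown level locally avoids spending a round trip on a request that would likely fail validation on the server.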

Using Reasoning in Conversations

Reasoning also works in multi-turn conversations:

import requests

url = "https://aihubmix.com/v1/responses"

headers = {
    "Authorization": "Bearer YOUR_AIHUBMIX_API_KEY",
    "Content-Type": "application/json",
}

data = {
    "model": "kimi-k2.5",
    "input": [
        {
            "type": "message",
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": "What is your favorite animal?",
                }
            ],
        },
        {
            "type": "message",
            "role": "assistant",
            "id": "msg_123",
            "status": "completed",
            "content": [
                {
                    "type": "output_text",
                    "text": "I don't have a favorite animal.",
                    "annotations": []
                }
            ],
        },
        {
            "type": "message",
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": "Why is the sky blue?",
                }
            ],
        },
    ],
    "reasoning": {
        "effort": "high"
    },
    "max_output_tokens": 5000,
}

response = requests.post(url, headers=headers, json=data)

print(response.status_code)
print(response.json())
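Writing each message item by hand gets repetitive as a conversation grows. The sketch below builds the same input list with two small helpers. The names user_message and assistant_message are hypothetical, and the item shapes are copied from the example above rather than from a formal schema.

```python
def user_message(text):
    """Build a user message item in the shape used by the example above."""
    return {
        "type": "message",
        "role": "user",
        "content": [{"type": "input_text", "text": text}],
    }


def assistant_message(text, message_id):
    """Build a completed assistant message item to replay an earlier turn."""
    return {
        "type": "message",
        "role": "assistant",
        "id": message_id,
        "status": "completed",
        "content": [{"type": "output_text", "text": text, "annotations": []}],
    }


conversation = [
    user_message("What is your favorite animal?"),
    assistant_message("I don't have a favorite animal.", "msg_123"),
    user_message("Why is the sky blue?"),
]

data = {
    "model": "kimi-k2.5",
    "input": conversation,
    "reasoning": {"effort": "high"},
    "max_output_tokens": 5000,
}
print(len(data["input"]))
```

The resulting data dictionary can be passed to requests.post exactly as in the example above.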

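Responses with Reasoning Information

When reasoning is enabled, the API returns a result that includes reasoning data: a reasoning item in output and a reasoning_tokens count under usage.output_tokens_details: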

{
  "id": "resp_051e00420efb9e150069aff6a18418819591abb7ce5f8487ed",
  "object": "response",
  "created_at": 1773139617,
  "status": "completed",
  "background": false,
  "completed_at": 1773139621,
  "content_filters": [
    {
      "blocked": false,
      "source_type": "completion",
      "content_filter_raw": [],
      "content_filter_results": {},
      "content_filter_offsets": {
        "start_offset": 0,
        "end_offset": 1147,
        "check_offset": 0
      }
    }
  ],
  "error": null,
  "frequency_penalty": 0.0,
  "incomplete_details": null,
  "instructions": null,
  "max_output_tokens": 5000,
  "max_tool_calls": null,
  "model": "gpt-54",
  "output": [
    {
      "id": "rs_051e00420efb9e150069aff6a32f948195996db3ff98314ef2",
      "type": "reasoning",
      "summary": []
    },
    {
      "id": "msg_051e00420efb9e150069aff6a33d808195825716a666d8ba8b",
      "type": "message",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "annotations": [],
          "logprobs": [],
          "text": "The sky looks blue because of how sunlight interacts with Earth’s atmosphere.\n\n1. **Sunlight isn’t just “white”**\nSunlight is made of many colors (red, orange, yellow, green, blue, violet), each with different wavelengths.\n\n2. **Air scatters short wavelengths more**\nAs sunlight passes through the atmosphere, it hits gas molecules and tiny particles.\n- Shorter wavelengths (blue, violet) are scattered in all directions much more than longer wavelengths (red, orange).\n- This effect is called **Rayleigh scattering**.\n\n3. **We see more blue than violet**\n- Our eyes are more sensitive to blue than to violet.\n- Some violet light is also absorbed higher in the atmosphere.\nSo the scattered light we perceive is mostly blue.\n\n4. **Why sunsets are red/orange**\nAt sunrise and sunset, sunlight passes through much more atmosphere.\n- Most of the blue light gets scattered out of the direct path.\n- The remaining light reaching your eyes from the Sun is richer in reds and oranges."
        }
      ]
    }
  ],
  "parallel_tool_calls": true,
  "presence_penalty": 0.0,
  "previous_response_id": null,
  "prompt_cache_key": null,
  "prompt_cache_retention": null,
  "reasoning": {
    "effort": "high",
    "summary": null
  },
  "safety_identifier": null,
  "service_tier": "default",
  "store": true,
  "temperature": 1.0,
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "tool_choice": "auto",
  "tools": [],
  "top_logprobs": 0,
  "top_p": 1.0,
  "truncation": "disabled",
  "usage": {
    "input_tokens": 35,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 267,
    "output_tokens_details": {
      "reasoning_tokens": 29
    },
    "total_tokens": 302
  },
  "user": null,
  "metadata": {}
}
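To use a response like this one, a client typically needs the answer text and the reasoning token count. The following is a minimal sketch that assumes the response shape shown above. The function names extract_text and reasoning_tokens are illustrative, and the sample dictionary is a trimmed copy of that response with shortened text.

```python
def extract_text(response_body):
    """Concatenate every output_text block found in message items."""
    parts = []
    for item in response_body.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning items and any other item types
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block["text"])
    return "".join(parts)


def reasoning_tokens(response_body):
    """Return the reasoning token count, or 0 when it is not reported."""
    usage = response_body.get("usage") or {}
    details = usage.get("output_tokens_details") or {}
    return details.get("reasoning_tokens", 0)


# Trimmed copy of the response above; the text is shortened for brevity.
sample = {
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "The sky looks blue..."}],
        },
    ],
    "usage": {"output_tokens": 267, "output_tokens_details": {"reasoning_tokens": 29}},
}

print(extract_text(sample))
print(reasoning_tokens(sample))
```

In a real call, pass response.json() in place of sample.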

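Usage Recommendations

Choose an appropriate effort level: use high for complex problems and low for simple tasks.
Account for token usage: reasoning increases token consumption.
Use streaming: for long reasoning chains, streaming can improve the user experience.
Provide context: give the model enough context to reason effectively.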
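One way to keep an eye on token consumption is to measure how much of the output was spent on reasoning. The sketch below assumes the usage object has the shape shown in the sample response. The function name reasoning_share is hypothetical.

```python
def reasoning_share(usage):
    """Return the fraction of output tokens that were reasoning tokens."""
    output_tokens = usage.get("output_tokens", 0)
    if output_tokens == 0:
        return 0.0  # avoid dividing by zero when nothing was generated
    details = usage.get("output_tokens_details") or {}
    return details.get("reasoning_tokens", 0) / output_tokens


# Values taken from the sample response in this document.
usage = {"output_tokens": 267, "output_tokens_details": {"reasoning_tokens": 29}}
share = reasoning_share(usage)
print(f"{share:.1%}")
```

Tracking this ratio across requests can help decide whether a lower effort level would be sufficient for a given workload.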

Last updated: 2026-06-01
