Jina AI Integration
Description
We have integrated the three core interfaces of Jina AI, helping you easily build powerful intelligent agents. These interfaces are primarily suitable for the following scenarios:
- Vector Embeddings (Embeddings): suited to multimodal RAG question-answering scenarios such as smart customer service, smart recruitment, and knowledge-base Q&A.
- Reranking (Rerank): re-sorts the Embedding candidate results by topical relevance, significantly improving the quality of answers from large language models.
- Deep Search (DeepSearch): searches and reasons in depth until the best answer is found; particularly suited to complex tasks such as research projects and product solution development.
We’ve enhanced the Jina AI API to support future extensions, so the usage may differ slightly from the official native implementation.
Quick Start
Replace the API_KEY with AIHUBMIX_API_KEY and swap the endpoint URL; all other parameters and usage are fully consistent with the official Jina AI API.
Endpoint Replacement:
- Vector Embeddings (Embeddings): https://jina.ai/embeddings -> https://aihubmix.com/v1/embeddings
- Reranking (Rerank): https://api.jina.ai/v1/rerank -> https://aihubmix.com/v1/rerank
- Deep Search (DeepSearch): https://deepsearch.jina.ai/v1/chat/completions -> https://aihubmix.com/v1/chat/completions
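For example, a minimal embeddings call against the new endpoint might look like this (a sketch using the requests library; the model name and inputs are illustrative):

```python
import os
import requests

# Minimal sketch: call the aihubmix.com embeddings endpoint with AIHUBMIX_API_KEY.
# The request body follows Jina AI's native embeddings format; only the base URL
# and the API key change.
url = "https://aihubmix.com/v1/embeddings"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['AIHUBMIX_API_KEY']}",
}
payload = {
    "model": "jina-embeddings-v3",          # illustrative model choice
    "input": ["Hello, world", "Bonjour le monde"],
}

resp = requests.post(url, headers=headers, json=payload)
resp.raise_for_status()
print(resp.json()["data"][0]["embedding"][:8])  # first few dimensions of the first vector
```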
Embeddings
Jina AI’s embeddings support both plain text and multimodal image input, and perform excellently on multilingual tasks.
Request Parameters
Model name. Available models:
- jina-clip-v2: multimodal, multilingual, 1024 dimensions, 8K context window, 865M parameters
- jina-embeddings-v3: text model, multilingual, 1024 dimensions, 8K context window, 570M parameters
- jina-colbert-v2: multilingual ColBERT model, 8K token context, 560M parameters, used for embedding and reranking
- jina-embeddings-v2-base-code: optimized for code and document search, 768 dimensions, 8K context window, 137M parameters
Input text or images; different models support different input formats. For text, provide an array of strings; for multimodal models, provide an array of objects containing text or image fields.
Return data type. Optional values:
- float: default; returns a list of floats, the most common and easiest format to use
- binary_int8: returns int8-packed binary; more efficient storage, search, and transmission
- binary_uint8: returns uint8-packed binary; more efficient storage, search, and transmission
- base64: returns a base64-encoded string; more efficient transmission
The number of dimensions used in computation. Supported values:
- 1024
- 768
1. Multimodal Usage
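A sketch of a multimodal request with jina-clip-v2 (the texts and image URL are placeholders; the input follows the object-array format described above):

```python
import os
import requests

# Illustrative multimodal request: mixed text and image items are passed as
# objects carrying a "text" or "image" field.
payload = {
    "model": "jina-clip-v2",
    "input": [
        {"text": "A beautiful sunset over the beach"},
        {"image": "https://example.com/beach.jpg"},  # placeholder image URL
    ],
}

resp = requests.post(
    "https://aihubmix.com/v1/embeddings",
    headers={"Authorization": f"Bearer {os.environ['AIHUBMIX_API_KEY']}"},
    json=payload,
)
for item in resp.json()["data"]:
    print(item["index"], len(item["embedding"]))
```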
2. Pure Text Usage
Provide only an array of text strings; do not include the image field.
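A text-only sketch with jina-embeddings-v3 (inputs are illustrative; dimensions is optional and limited to the supported values listed above):

```python
import os
import requests

# Text-only request: the input is a plain array of strings, with no "image" field.
payload = {
    "model": "jina-embeddings-v3",
    "input": [
        "A beautiful sunset over the beach",
        "Un beau coucher de soleil sur la plage",
        "海滩上美丽的日落",
    ],
    "dimensions": 1024,  # optional; one of the supported dimension sizes
}

resp = requests.post(
    "https://aihubmix.com/v1/embeddings",
    headers={"Authorization": f"Bearer {os.environ['AIHUBMIX_API_KEY']}"},
    json=payload,
)
print([len(d["embedding"]) for d in resp.json()["data"]])
```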
Rerank
The reranker improves search relevance and RAG accuracy. It deeply analyzes the initial search results, accounts for subtle interactions between the query and document content, and reorders the results so that the most relevant ones appear at the top.
Request Parameters
Model name. Available models:
- jina-reranker-m0: multimodal, multilingual document reranker, 10K context window, 2.4B parameters, for visual document ranking
Search query text, used to compare against the candidate documents.
The number of most relevant documents to return; by default all documents are returned.
Array of candidate documents; they will be reordered by relevance to the query.
Maximum chunk length per document, applicable only to Cohere models (not supported by Jina); defaults to 4096. Longer documents are automatically truncated to the specified number of tokens.
1. Multimodal Usage
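A sketch of a multimodal rerank request with jina-reranker-m0 (the query, documents, and image URLs are placeholders; passing documents as objects with text or image fields is an assumption based on Jina AI's multimodal reranker format):

```python
import os
import requests

# Multimodal rerank: the query is text, each candidate document is an object
# carrying a "text" or "image" field. top_n limits how many results come back.
payload = {
    "model": "jina-reranker-m0",
    "query": "small language models for data extraction",
    "top_n": 3,
    "documents": [
        {"image": "https://example.com/figure-1.png"},   # placeholder image URL
        {"image": "https://example.com/figure-2.png"},   # placeholder image URL
        {"text": "ReaderLM-v2 is a compact 1.5B-parameter model for HTML-to-markdown conversion."},
        {"text": "Data extraction with small language models has become increasingly practical."},
    ],
}

resp = requests.post(
    "https://aihubmix.com/v1/rerank",
    headers={"Authorization": f"Bearer {os.environ['AIHUBMIX_API_KEY']}"},
    json=payload,
)
for r in resp.json()["results"]:
    print(r["index"], round(r["relevance_score"], 4))
```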
Response Description
Successful response:
- model: the name of the model used
- results: an array of reranking results, sorted by relevance score in descending order; each element contains:
  - index: the index position in the original documents array
  - relevance_score: a relevance score between 0 and 1; higher scores indicate greater relevance to the query
- usage: usage statistics
  - total_tokens: the total number of tokens processed by this request
2. Text Usage
Text reranking supports both multilingual and regular tasks; similar to the embeddings usage, you pass in an array of documents, as in the sketch below.
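A text-only rerank sketch (query and documents are illustrative; the response fields match the description above):

```python
import os
import requests

# Text-only rerank: candidate documents are plain strings and may mix languages.
payload = {
    "model": "jina-reranker-m0",
    "query": "Organic skincare products for sensitive skin",
    "top_n": 3,
    "documents": [
        "Organic skincare for sensitive skin with aloe vera and chamomile.",
        "New makeup trends focus on bold colors and innovative techniques.",
        "Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille.",
        "针对敏感肌专门设计的天然有机护肤产品。",
    ],
}

resp = requests.post(
    "https://aihubmix.com/v1/rerank",
    headers={"Authorization": f"Bearer {os.environ['AIHUBMIX_API_KEY']}"},
    json=payload,
)
for r in resp.json()["results"]:
    print(round(r["relevance_score"], 4), r["index"])
```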
DeepSearch
DeepSearch combines search, reading, and reasoning capabilities to pursue the best possible answer. It’s fully compatible with OpenAI’s Chat API format—just replace api.openai.com with aihubmix.com to get started.
The streamed response includes the thinking process.
Request Parameters
Model name. Available models:
- jina-deepsearch-v1: default model; searches, reads, and reasons until the best answer is found
Whether to enable streaming responses. It is strongly recommended to keep this enabled: DeepSearch requests may take a long time to complete, and disabling streaming may result in a 524 timeout error.
The list of conversation messages between the user and the assistant. Multiple modalities are supported, such as text and documents (.txt, .pdf) and images (.png, .webp, .jpeg). The maximum file size is 10MB.
Multimodal Message Format
DeepSearch supports multiple types of message formats, which can include pure text (message), files (file), and images (image). The following are examples of different formats:
1. Pure Text Message
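A plain-text conversation uses the standard OpenAI-style message list, for example:

```python
# Plain-text message: the familiar OpenAI chat format, no attachments.
messages = [
    {"role": "user", "content": "What is the latest blog post from Jina AI about?"}
]
```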
2. Message with File Attachment
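A message with a file attachment embeds the file as a data URI inside the message content. The field names below (type, file, data, mimeType) follow Jina AI's published DeepSearch examples and are an assumption for this endpoint; the file name is hypothetical:

```python
import base64

# Read a local PDF and embed it as a data URI (maximum file size 10MB).
with open("paper.pdf", "rb") as f:  # hypothetical file name
    pdf_b64 = base64.b64encode(f.read()).decode()

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What are the main contributions of this paper?"},
            {
                # Assumed field names, following Jina AI's DeepSearch examples
                "type": "file",
                "data": f"data:application/pdf;base64,{pdf_b64}",
                "mimeType": "application/pdf",
            },
        ],
    }
]
```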
3. Message with Image
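A message with an image likewise passes the image as a data URI; the type/image field names follow Jina AI's DeepSearch examples and are an assumption here, and the file name is hypothetical:

```python
import base64

# Encode a local image as a data URI (maximum file size 10MB).
with open("chart.png", "rb") as f:  # hypothetical file name
    img_b64 = base64.b64encode(f.read()).decode()

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What trend does this chart show?"},
            {
                # Assumed field names, following Jina AI's DeepSearch examples
                "type": "image",
                "image": f"data:image/png;base64,{img_b64}",
            },
        ],
    }
]
```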
All files and images must be encoded as data URIs in advance, with a maximum file size of 10MB.
Calling Example
Please note that the Python streaming example on Jina AI's official site does not return a response; refer to our example below instead.
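A streaming sketch using the OpenAI Python SDK pointed at aihubmix.com (the model and prompt are illustrative):

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["AIHUBMIX_API_KEY"],
    base_url="https://aihubmix.com/v1",
)

# Keep streaming enabled: DeepSearch requests can run for a long time, and
# non-streaming calls may hit a 524 timeout.
stream = client.chat.completions.create(
    model="jina-deepsearch-v1",
    messages=[{"role": "user", "content": "What is the latest blog post from Jina AI about?"}],
    stream=True,
)

for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta and delta.content:
        print(delta.content, end="", flush=True)
```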
Response Description
The response from DeepSearch is streamed by default and includes both intermediate reasoning steps and the final answer. The final chunk of the stream contains the complete answer, a list of visited URLs, and token usage details. If streaming is disabled, only the final answer is returned and the intermediate "thinking" steps are omitted. Note: this JSON object differs from the format used by Jina AI.
Python Return Example: