Mini Agent Source Code Walkthrough — Part 5: The LLM Engine
Introduction
The previous posts covered Mini Agent's main principles fairly thoroughly. This one takes a different angle and looks at how Mini Agent integrates with different LLM providers — specifically, how it switches flexibly between the OpenAI and Anthropic calling conventions, and how the LLM engine's abstraction layer is designed on top of that.
Code Example
examples/05_provider_selection.py demonstrates how to switch between LLM providers via the provider parameter of LLMClient. The heart of the example is how the LLMProvider enum is used, and how client attributes such as api_base change along with it.
1. Specifying the call target via the provider parameter
The example initializes two clients, one with LLMProvider.ANTHROPIC and one with LLMProvider.OPENAI:
```python
from mini_agent import LLMClient, LLMProvider, Message

anthropic_client = LLMClient(
    api_key=config["api_key"],
    provider=LLMProvider.ANTHROPIC,
    model=config.get("model", "MiniMax-M2.5"),
)

openai_client = LLMClient(
    api_key=config["api_key"],
    provider=LLMProvider.OPENAI,
    model=config.get("model", "MiniMax-M2.5"),
)
```
The two initializations are identical except for the value passed to provider. This shows that LLMClient hides provider differences from the caller: the calling code never needs to know which vendor's API sits underneath — the provider value alone determines which endpoint the request goes to, how the request body is assembled, and how the response is parsed.
2. Default behavior when no provider is specified
The example also shows what happens when the provider argument is omitted:
```python
client = LLMClient(
    api_key=config["api_key"],
    model=config.get("model", "MiniMax-M2.5"),
)
print(f"Provider (default): {client.provider}")
```
The default provider is LLMProvider.ANTHROPIC, so a client created without an explicit provider automatically uses the Anthropic calling convention. This is a friendly default for most scenarios: configure an API key and a model name and you are ready to go, without declaring a provider every time.
3. Comparing multiple providers through one interface
The example also compares the output of the two providers directly:
```python
messages = [Message(role="user", content="What is 2+2?")]

anthropic_response = await anthropic_client.generate(messages)
print(f"Anthropic: {anthropic_response.content}")

openai_response = await openai_client.generate(messages)
print(f"OpenAI: {openai_response.content}")
```
The same messages object is passed directly to both clients, and both return a correct answer. Even though the underlying wire formats differ, the LLMClient.generate() interface is completely uniform: the framework handles request conversion and response parsing internally, so upper-layer code never needs to branch on the provider.
Running this code reveals a difference between the two providers' outputs:
```text
============================================================
DEMO: LLMClient with Anthropic Provider
============================================================
Provider: LLMProvider.ANTHROPIC
API Base: https://api.minimaxi.com/anthropic

👤 User: Say 'Hello from Anthropic!'
💭 Thinking: The user wants me to say "Hello from Anthropic!". This is a simple request to greet them on behalf of Anthropic.
💬 Model: Hello from Anthropic! 👋

✅ Anthropic provider demo completed

============================================================
DEMO: LLMClient with OpenAI Provider
============================================================
Provider: LLMProvider.OPENAI
API Base: https://api.minimaxi.com/v1

👤 User: Say 'Hello from OpenAI!'
💬 Model: Hello from OpenAI!

✅ OpenAI provider demo completed
```
Note that the OpenAI output contains no "Thinking" step, while the Anthropic output includes the model's reasoning. This stems from differences in how the two vendors designed their interfaces, but from the caller's point of view those differences are already hidden behind LLMClient.
Source Code Walkthrough
The files under mini_agent/llm/ make up Mini Agent's LLM calling layer, and the structure is very clear:
```text
llm/
├── base.py
├── llm_wrapper.py
├── anthropic_client.py
└── openai_client.py
```
The overall design: define a unified abstract base interface, then let LLMClient act as a dispatch layer that routes to a concrete protocol implementation based on the provider parameter.
1. The abstract base class: LLMClientBase
base.py defines the interface every LLM client must implement:
```python
class LLMClientBase(ABC):
    @abstractmethod
    async def generate(
        self,
        messages: list[Message],
        tools: list[Any] | None = None,
    ) -> LLMResponse:
        pass

    @abstractmethod
    def _prepare_request(
        self,
        messages: list[Message],
        tools: list[Any] | None = None,
    ) -> dict[str, Any]:
        pass

    @abstractmethod
    def _convert_messages(self, messages: list[Message]) -> tuple[str | None, list[dict[str, Any]]]:
        pass
```
The three abstract methods map to the three stages every LLM call goes through: message format conversion → request preparation → execution and response parsing. A subclass only needs to implement these three methods to plug into the framework.
2. LLMClient: unified entry point and routing layer
LLMClient in llm_wrapper.py is the public-facing interface; its job is to instantiate the matching underlying client based on the provider parameter:
```python
class LLMClient:
    def __init__(
        self,
        api_key: str,
        provider: LLMProvider = LLMProvider.ANTHROPIC,
        api_base: str = "https://api.minimaxi.com",
        model: str = "MiniMax-M2.5",
        retry_config: RetryConfig | None = None,
    ):
        if provider == LLMProvider.ANTHROPIC:
            self._client = AnthropicClient(api_key=api_key, api_base=full_api_base, model=model, ...)
        elif provider == LLMProvider.OPENAI:
            self._client = OpenAIClient(api_key=api_key, api_base=full_api_base, model=model, ...)
```
Once initialized, generate() simply delegates to the inner client:
```python
async def generate(self, messages: list[Message], tools: list | None = None) -> LLMResponse:
    return await self._client.generate(messages, tools)
```
One detail worth noting: when api_base contains the MiniMax domain, LLMClient automatically appends the path suffix — /anthropic for Anthropic, /v1 for OpenAI. For third-party APIs, the api_base is used exactly as passed in. This automatic adaptation cuts down on configuration hassle.
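The suffix behavior described above can be sketched as a small standalone function. This is not the library's actual code — resolve_api_base is a hypothetical helper, and the LLMProvider enum is redefined locally so the sketch is self-contained — but it illustrates the routing rule: MiniMax endpoints get a provider-specific suffix, everything else passes through untouched.

```python
from enum import Enum


class LLMProvider(str, Enum):
    ANTHROPIC = "anthropic"
    OPENAI = "openai"


# Hypothetical helper; the real logic lives inside LLMClient.__init__.
def resolve_api_base(api_base: str, provider: LLMProvider) -> str:
    """Append the provider-specific path suffix, but only for MiniMax endpoints."""
    if "minimaxi.com" in api_base:
        suffix = "/anthropic" if provider == LLMProvider.ANTHROPIC else "/v1"
        return api_base.rstrip("/") + suffix
    # Third-party endpoints are used exactly as passed in.
    return api_base


print(resolve_api_base("https://api.minimaxi.com", LLMProvider.ANTHROPIC))
# → https://api.minimaxi.com/anthropic
print(resolve_api_base("https://example.com/api", LLMProvider.OPENAI))
# → https://example.com/api
```

The matching values in the demo output earlier ("API Base: https://api.minimaxi.com/anthropic" vs. ".../v1") are exactly this rule at work.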
3. AnthropicClient: the Anthropic protocol implementation
AnthropicClient follows the Anthropic API format; the differences are concentrated in two places, message conversion and response parsing.
For message conversion, Anthropic's convention is to pull the system message out into a separate system parameter rather than leaving it in the messages array, and to support thinking and tool_use as content blocks:
```python
def _convert_messages(self, messages: list[Message]) -> tuple[str | None, list[dict[str, Any]]]:
    """Convert internal messages to Anthropic format.

    Args:
        messages: List of internal Message objects

    Returns:
        Tuple of (system_message, api_messages)
    """
    system_message = None
    api_messages = []

    for msg in messages:
        if msg.role == "system":
            system_message = msg.content
            continue

        if msg.role in ["user", "assistant"]:
            if msg.role == "assistant" and (msg.thinking or msg.tool_calls):
                content_blocks = []
                if msg.thinking:
                    content_blocks.append({"type": "thinking", "thinking": msg.thinking})
                if msg.content:
                    content_blocks.append({"type": "text", "text": msg.content})
                if msg.tool_calls:
                    for tool_call in msg.tool_calls:
                        content_blocks.append(
                            {
                                "type": "tool_use",
                                "id": tool_call.id,
                                "name": tool_call.function.name,
                                "input": tool_call.function.arguments,
                            }
                        )
                api_messages.append({"role": "assistant", "content": content_blocks})
            else:
                api_messages.append({"role": msg.role, "content": msg.content})
        elif msg.role == "tool":
            api_messages.append(
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "tool_result",
                            "tool_use_id": msg.tool_call_id,
                            "content": msg.content,
                        }
                    ],
                }
            )

    return system_message, api_messages
```
A simple input message looks like this after conversion (for reference only):
```json
{
  "system_message": "You are a useful assistant.",
  "api_messages": [
    {
      "role": "user",
      "content": "Say 'Hello from Anthropic!'"
    }
  ],
  "tools": [
    {
      "name": "read_file",
      "description": "Read file contents from the filesystem...",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {
            "type": "string",
            "description": "Absolute or relative path to the file"
          },
          ...
        },
        "required": ["path"]
      }
    }
  ]
}
```
For response parsing, Anthropic returns content as an array of blocks, which must be inspected type by type:
```python
def _parse_response(self, response: anthropic.types.Message) -> LLMResponse:
    """Parse Anthropic response into LLMResponse.

    Args:
        response: Anthropic Message response

    Returns:
        LLMResponse object
    """
    text_content = ""
    thinking_content = ""
    tool_calls = []

    for block in response.content:
        if block.type == "text":
            text_content += block.text
        elif block.type == "thinking":
            thinking_content += block.thinking
        elif block.type == "tool_use":
            tool_calls.append(
                ToolCall(
                    id=block.id,
                    type="function",
                    function=FunctionCall(
                        name=block.name,
                        arguments=block.input,
                    ),
                )
            )

    usage = None
    if hasattr(response, "usage") and response.usage:
        input_tokens = response.usage.input_tokens or 0
        output_tokens = response.usage.output_tokens or 0
        cache_read_tokens = getattr(response.usage, "cache_read_input_tokens", 0) or 0
        cache_creation_tokens = getattr(response.usage, "cache_creation_input_tokens", 0) or 0
        total_input_tokens = input_tokens + cache_read_tokens + cache_creation_tokens
        usage = TokenUsage(
            prompt_tokens=total_input_tokens,
            completion_tokens=output_tokens,
            total_tokens=total_input_tokens + output_tokens,
        )

    return LLMResponse(
        content=text_content,
        thinking=thinking_content if thinking_content else None,
        tool_calls=tool_calls if tool_calls else None,
        finish_reason=response.stop_reason or "stop",
        usage=usage,
    )
```
Here is an example of an Anthropic-format response:
```json
{
  "id": "...",
  "content": [
    {
      "signature": "...",
      "thinking": "The user wants me to say hello. This is a simple request that doesn't require any tools.",
      "type": "thinking"
    },
    {
      "citations": null,
      "text": "Hello from Anthropic!",
      "type": "text"
    }
  ],
  "model": "MiniMax-M2.7",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "cache_creation": null,
    "cache_creation_input_tokens": null,
    "cache_read_input_tokens": null,
    "input_tokens": 296,
    "output_tokens": 28,
    "server_tool_use": null,
    "service_tier": null
  },
  "base_resp": {
    "status_code": 0,
    "status_msg": ""
  }
}
```
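One subtlety in the parsing above is the token accounting: Anthropic reports cached input tokens separately, so the client folds them back into the prompt total before building TokenUsage. The standalone sketch below (total_prompt_tokens is a hypothetical helper, not library code; the dict keys follow the usage object in the sample response) isolates that arithmetic:

```python
# Hypothetical helper mirroring the accounting in _parse_response:
# total prompt tokens = fresh input + cache reads + cache creation.
def total_prompt_tokens(usage: dict) -> int:
    """Sum fresh, cache-read, and cache-creation input tokens; None counts as 0."""
    input_tokens = usage.get("input_tokens") or 0
    cache_read = usage.get("cache_read_input_tokens") or 0
    cache_creation = usage.get("cache_creation_input_tokens") or 0
    return input_tokens + cache_read + cache_creation


# Mirrors the sample response above: 296 input tokens, no cache activity.
usage = {
    "input_tokens": 296,
    "cache_read_input_tokens": None,
    "cache_creation_input_tokens": None,
}
print(total_prompt_tokens(usage))  # → 296
```

The `or 0` guards matter because, as the sample shows, the cache fields can come back as null rather than 0.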
4. OpenAIClient: the OpenAI protocol implementation
OpenAIClient follows the same overall approach, but the concrete formats differ.
For message conversion, OpenAI keeps the system message inside the messages array, so no separate extraction is needed; it also carries the reasoning trace through a reasoning_details field:
```python
def _convert_messages(self, messages: list[Message]) -> tuple[str | None, list[dict[str, Any]]]:
    """Convert internal messages to OpenAI format.

    Args:
        messages: List of internal Message objects

    Returns:
        Tuple of (system_message, api_messages)

    Note:
        OpenAI includes system message in the messages array
    """
    api_messages = []

    for msg in messages:
        if msg.role == "system":
            api_messages.append({"role": "system", "content": msg.content})
            continue

        if msg.role == "user":
            api_messages.append({"role": "user", "content": msg.content})
        elif msg.role == "assistant":
            assistant_msg = {"role": "assistant"}
            if msg.content:
                assistant_msg["content"] = msg.content
            if msg.tool_calls:
                tool_calls_list = []
                for tool_call in msg.tool_calls:
                    tool_calls_list.append(
                        {
                            "id": tool_call.id,
                            "type": "function",
                            "function": {
                                "name": tool_call.function.name,
                                "arguments": json.dumps(tool_call.function.arguments),
                            },
                        }
                    )
                assistant_msg["tool_calls"] = tool_calls_list
            if msg.thinking:
                assistant_msg["reasoning_details"] = [{"text": msg.thinking}]
            api_messages.append(assistant_msg)
        elif msg.role == "tool":
            api_messages.append(
                {
                    "role": "tool",
                    "tool_call_id": msg.tool_call_id,
                    "content": msg.content,
                }
            )

    return None, api_messages
```
An example request in OpenAI format:
```json
{
  "api_messages": [
    {
      "role": "system",
      "content": "You are a useful assistant."
    },
    {
      "role": "user",
      "content": "Say 'Hello from OpenAI!'"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "read_file",
        "description": "Read file contents from the filesystem...",
        "parameters": {
          "type": "object",
          "properties": {
            "path": {
              "type": "string",
              "description": "Absolute or relative path to the file"
            },
            ...
          },
          "required": ["path"]
        }
      }
    }
  ]
}
```
For response parsing, reasoning_details is a field on the message object and must be extracted separately:
```python
def _parse_response(self, response: Any) -> LLMResponse:
    """Parse OpenAI response into LLMResponse.

    Args:
        response: OpenAI ChatCompletion response (full response object)

    Returns:
        LLMResponse object
    """
    message = response.choices[0].message
    text_content = message.content or ""

    thinking_content = ""
    if hasattr(message, "reasoning_details") and message.reasoning_details:
        for detail in message.reasoning_details:
            if hasattr(detail, "text"):
                thinking_content += detail.text

    tool_calls = []
    if message.tool_calls:
        for tool_call in message.tool_calls:
            arguments = json.loads(tool_call.function.arguments)
            tool_calls.append(
                ToolCall(
                    id=tool_call.id,
                    type="function",
                    function=FunctionCall(
                        name=tool_call.function.name,
                        arguments=arguments,
                    ),
                )
            )

    usage = None
    if hasattr(response, "usage") and response.usage:
        usage = TokenUsage(
            prompt_tokens=response.usage.prompt_tokens or 0,
            completion_tokens=response.usage.completion_tokens or 0,
            total_tokens=response.usage.total_tokens or 0,
        )

    return LLMResponse(
        content=text_content,
        thinking=thinking_content if thinking_content else None,
        tool_calls=tool_calls if tool_calls else None,
        finish_reason="stop",
        usage=usage,
    )
```
An example response in OpenAI format:
```json
{
  "id": "...",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Hello from OpenAI!",
        "refusal": null,
        "role": "assistant",
        "annotations": null,
        "audio": null,
        "function_call": null,
        "tool_calls": null,
        "reasoning_content": "The user is asking me to say \"Hello from OpenAI!\". This is a simple request that doesn't require any tool usage. I'll just respond with the greeting.\n",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "id": "reasoning-text-1",
            "format": "MiniMax-response-v1",
            "index": 0,
            "text": "The user is asking me to say \"Hello from OpenAI!\". This is a simple request that doesn't require any tool usage. I'll just respond with the greeting.\n"
          }
        ]
      }
    }
  ],
  "created": 1776599306,
  "model": "MiniMax-M2.7",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 40,
    "prompt_tokens": 302,
    "total_tokens": 342,
    "completion_tokens_details": null,
    "prompt_tokens_details": {
      "audio_tokens": null,
      "cached_tokens": 0
    },
    "total_characters": 0
  },
  "input_sensitive": false,
  "output_sensitive": false,
  "input_sensitive_type": 0,
  "output_sensitive_type": 0,
  "output_sensitive_int": 0,
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  }
}
```
5. Core differences between the two protocols
Comparing the two implementations, three differences stand out:
| | Anthropic | OpenAI |
| --- | --- | --- |
| system handling | extracted into a standalone system parameter | kept inside the messages array |
| thinking handling | a thinking block inside content | a reasoning_details field |
| tool result format | tool_result placed in a user message's content blocks | a message with the tool role |
The framework wraps both clients in one more layer, LLMClient, so callers never have to think about these differences. Switching providers means changing a single parameter; the message conversion and response parsing underneath adapt automatically.
Summary
This post walked through Mini Agent's LLM engine layer, from the code example all the way down to the source.
At the example level, examples/05_provider_selection.py shows how LLMClient supports different LLM providers via the provider parameter. The same generate() interface switches seamlessly between the Anthropic and OpenAI calling conventions, with the differences hidden inside LLMClient.
At the source level, the four files under mini_agent/llm/ each have a clear role: LLMClientBase defines the unified abstract interface, LLMClient routes to a concrete implementation based on provider, and AnthropicClient and OpenAIClient each handle their own protocol details — message format conversion, request body assembly, and response parsing. The three main differences (system handling, thinking handling, tool result format) are each absorbed inside the concrete clients.
The core benefit of this design is extension friendliness: adding a new LLM provider only requires subclassing LLMClientBase, implementing the three core methods, and adding one branch to LLMClient's initialization — no upper-layer code changes at all. With that, Mini Agent's core modules — the tool system, the memory system, the workflow, and the LLM engine — have now all been covered.