diff --git a/.vitepress/config.mts b/.vitepress/config.mts index 34810f72..fa8e4b2d 100644 --- a/.vitepress/config.mts +++ b/.vitepress/config.mts @@ -244,6 +244,16 @@ export default defineConfig({ collapsed: false, items: [ { text: 'Concepts', link: 'tutorials/concepts.md' }, + { + text: 'Atomic Agent', + link: 'tutorials/atomic_roles/intro.md', + items: [ + { + text: 'RoleZero', + link: 'tutorials/atomic_roles/role_zero.md', + }, + ], + }, { text: 'Agent 101', link: 'tutorials/agent_101.md' }, { text: 'MultiAgent 101', @@ -538,6 +548,16 @@ export default defineConfig({ text: '概念简述', link: 'tutorials/concepts', }, + { + text: '原子化智能体', + link: 'tutorials/atomic_roles/intro.md', + items: [ + { + text: 'RoleZero', + link: 'tutorials/atomic_roles/role_zero.md', + }, + ], + }, { text: '智能体入门', link: 'tutorials/agent_101', diff --git a/src/en/guide/get_started/configuration/llm_api_configuration.md b/src/en/guide/get_started/configuration/llm_api_configuration.md index 11363657..7afbedb5 100644 --- a/src/en/guide/get_started/configuration/llm_api_configuration.md +++ b/src/en/guide/get_started/configuration/llm_api_configuration.md @@ -1,4 +1,5 @@ # LLM API Configuration + # LLM API Configuration After completing the installation, follow these steps to configure the LLM API, using the OpenAI API as an example. This process is similar for other LLM APIs. @@ -38,7 +39,7 @@ llm: It can be used to initialize LLM. Due to some restrictions on the use of o1 series, problems can be reported to us in time. -With these steps, your setup is complete. For starting with MetaGPT, check out the [Quickstart guide](./quickstart) or our [Tutorials](/en/guide/tutorials/agent_101). +With these steps, your setup is complete. For starting with MetaGPT, check out the [Quickstart guide](./quickstart.md) or our [Tutorials](/en/guide/tutorials/agent_101.md). MetaGPT supports a range of LLM models. Configure your model API keys as needed. 
diff --git a/src/en/guide/in_depth_guides/agent_communication.md b/src/en/guide/in_depth_guides/agent_communication.md index d88cb3cf..871c107e 100644 --- a/src/en/guide/in_depth_guides/agent_communication.md +++ b/src/en/guide/in_depth_guides/agent_communication.md @@ -15,6 +15,8 @@ class Message(BaseModel): cause_by: str = Field(default="", validate_default=True) sent_from: str = Field(default="", validate_default=True) send_to: set[str] = Field(default={MESSAGE_ROUTE_TO_ALL}, validate_default=True) + metadata: Dict[str, Any] = Field(default_factory=dict) # metadata for `content` and `instruct_content` + ``` When planning the message forwarding process between agents, it's essential to first determine the functional boundaries of the agents, similar to designing a function: diff --git a/src/en/guide/in_depth_guides/environment/intro.md b/src/en/guide/in_depth_guides/environment/intro.md index 9ab20400..5a023eaf 100644 --- a/src/en/guide/in_depth_guides/environment/intro.md +++ b/src/en/guide/in_depth_guides/environment/intro.md @@ -12,6 +12,13 @@ In `ExtEnv`, we refer to the design of `gymnasium` in the reinforcement learning In addition, the decorators `mark_as_readable` and `mark_as_writeable` for the different `read-write` interfaces provided by `ExtEnv` are also provided to facilitate the unified management of method interfaces for external environment docking, so that subsequent agents can use them as A tool capability that can directly and automatically call different external environment docking interfaces based on the input natural language (this part of the function is to be opened). +### Observation space, action space definition specification + +When defining observation and action space in `gymnasium`, discrete values or continuous values are generally defined. However, in the supported scene environments, game engine services or external simulators are more often accessed through APIs or interfaces. 
Therefore, the action space (`gymnasium.spaces.Dict`) contains subspace definitions for the different action types and the input parameters each action requires. The observation space (`gymnasium.spaces.Dict`) contains information that can be obtained from the environment, such as maps and screenshots. + +`BaseEnvActionType` in `metagpt.base.base_env_space` defines the action type, `BaseEnvAction` defines a set of values corresponding to the action space, and `BaseEnvObsType` defines the observation type. +Generally, the observation space values obtained in `gymnasium` are a complete set of observation values, but in practical applications it is often necessary to obtain local observation values from the environment (for example, in Stanford Town, it is necessary to obtain map information within the field of view of the agent's location, rather than the complete map). We have added the `observe(self, obs_params: Optional[BaseEnvObsParams] = None)` method to obtain local environment information. `BaseEnvObsParams` defines the parameters required to obtain observation values, including the observation type and its required input parameters. + ## Different Environments Currently, we provide several scenario environments and provide corresponding scenario usage entrances under `MetaGPT/examples/`. 
@@ -20,4 +27,5 @@ Currently, we provide several scenario environments and provide corresponding sc - Added, [Werewolf Environment](./werewolf.md) - Added, [Stanford Town Environment](./stanford_town.md) - Added, [Android Environment](./android.md) +- Added, [MGXEnv Environment](./mgx.md) - ToBeAdded, [Web Environment](./web.md) diff --git a/src/en/guide/in_depth_guides/environment/mgx.md b/src/en/guide/in_depth_guides/environment/mgx.md new file mode 100644 index 00000000..8a1bb258 --- /dev/null +++ b/src/en/guide/in_depth_guides/environment/mgx.md @@ -0,0 +1,96 @@ +# MGX Environment + +[Code Entry](https://github.com/geekan/MetaGPT/tree/main/metagpt/environment/mgx/mgx_env.py) + +MGXEnv is a generic multi-agent collaboration environment that provides a flexible and powerful interaction framework. During initialization, the environment supports configuring multiple agents with different roles, each equipped with a specific prompt system to guide their behavior and responsibilities. The core feature of the environment is its unique message management mechanism: TeamLeader acts as a central coordinator, uniformly managing the flow and distribution of all messages. This design ensures both the orderliness of information transmission and supports flexible interaction methods, including public dialogue and private communication. Through this architectural design, MGXEnv can effectively support complex multi-agent collaboration scenarios, enabling different roles to efficiently complete division of labor and cooperation according to their respective professional fields and task requirements. + +## Space Definition + +### Message Space + +MGXEnv mainly handles message routing and publishing in a multi-agent environment. 
The core message space is defined by the Message class with the following structure: + +Definition: +```python +from gymnasium import spaces + +space = { + "role": spaces.Text(16), # Message role type + "content": spaces.Text(1024), # Actual message content + "sent_from": spaces.Text(32), # Sender name + "send_to": spaces.Sequence(spaces.Text(32)), # Recipient names (gymnasium has no Set space, so Sequence stands in) + "metadata": spaces.Dict(), # Additional metadata like images +} +``` + +Message Space Components: + +| Field | Description | Value Range | +|-------|-------------|-------------| +| role | Message role type | One of ["user", "assistant", "system"] | +| content | Actual message content | Maximum length 1024 characters | +| sent_from | Message sender name | Maximum length 32 characters | +| send_to | Set of recipient names | Each name maximum 32 characters | +| metadata | Additional message metadata | Dictionary containing optional fields (like images) | + +Message Example: +```python +from metagpt.schema import Message + +Message( + role="assistant", + content="Analysis completed.", + sent_from="Alice", + send_to={"Mike", "<all>"}, + metadata={"agent": "Emma"} +) +``` + +### Communication Modes + +The environment supports two communication modes: + +1. Public Chat Mode (default) +- All messages visible to all roles (send_to includes `<all>`, the value of `MESSAGE_ROUTE_TO_ALL`) +- Message flow coordinated by team leader (Mike) +- Messages stored in environment history + +2. Direct Chat Mode +- Triggered when user directly messages a specific role +- Communication only between user and target role +- Bypasses team leader +- Message publishing to all depends on is_public_chat flag + +This environment focuses on message routing and coordination rather than traditional state/action spaces seen in other environments. 
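The two communication modes above can be sketched in plain Python. This is a minimal illustration, not MGXEnv's implementation: `SketchMessage`, `visible_to`, and the `ROUTE_TO_ALL` constant are hypothetical names, the marker value `"<all>"` is assumed to match `MESSAGE_ROUTE_TO_ALL`, and the team leader's coordination step is left out entirely.

```python
from dataclasses import dataclass, field

# Assumed stand-in for MetaGPT's MESSAGE_ROUTE_TO_ALL marker.
ROUTE_TO_ALL = "<all>"


@dataclass
class SketchMessage:
    """Hypothetical, simplified message; not metagpt.schema.Message."""

    content: str
    sent_from: str = ""
    # By default a message is addressed to everyone (public chat).
    send_to: set = field(default_factory=lambda: {ROUTE_TO_ALL})


def visible_to(msg, role_names):
    """Return the sorted role names that would see `msg`."""
    if ROUTE_TO_ALL in msg.send_to:
        # Public chat: every role in the environment sees the message.
        return sorted(role_names)
    # Direct chat: only the explicitly named recipients see it.
    return sorted(name for name in role_names if name in msg.send_to)


roles = ["Mike", "Alice", "Bob"]
public_msg = SketchMessage("Analysis completed.", sent_from="Alice")
direct_msg = SketchMessage("Fix the login bug", sent_from="User", send_to={"Bob"})

print(visible_to(public_msg, roles))
print(visible_to(direct_msg, roles))
```

The sketch only shows how the recipient set decides visibility; in the real environment the `is_public_chat` flag and the team leader also influence where a message ends up.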
+ +## Usage + +```python +from metagpt.environment.mgx.mgx_env import MGXEnv +from metagpt.roles.di.team_leader import TeamLeader +from metagpt.schema import Message +from metagpt.roles import ( + Architect, + Engineer, + ProductManager, + ProjectManager, + QaEngineer, +) + +env = MGXEnv() + +env.add_roles( + [ + TeamLeader(), + ProductManager(), + Architect(), + ProjectManager(), + Engineer(n_borg=5, use_code_review=True), + QaEngineer(), + ] + ) +requirement = "create a 2048 game" +tl = env.get_role("Mike") +env.publish_message(Message(content=requirement, send_to=tl.name)) +await tl.run() +``` \ No newline at end of file diff --git a/src/en/guide/in_depth_guides/environment/werewolf.md b/src/en/guide/in_depth_guides/environment/werewolf.md index c9d784b1..1c2fb134 100644 --- a/src/en/guide/in_depth_guides/environment/werewolf.md +++ b/src/en/guide/in_depth_guides/environment/werewolf.md @@ -12,7 +12,7 @@ Definition: ```python from gymnasium import spaces -from metagpt.environment.werewolf.const import STEP_INSTRUCTIONS +from metagpt.environment.werewolf.werewolf_ext_env import STEP_INSTRUCTIONS space = spaces.Dict( { diff --git a/src/en/guide/tutorials/agent_101.md b/src/en/guide/tutorials/agent_101.md index c9e13aa7..04be45bb 100644 --- a/src/en/guide/tutorials/agent_101.md +++ b/src/en/guide/tutorials/agent_101.md @@ -12,20 +12,34 @@ Import any role, initialize it, run it with a starting message, done! ```python import asyncio -from metagpt.context import Context from metagpt.roles.product_manager import ProductManager from metagpt.logs import logger +from metagpt.schema import Message async def main(): - msg = "Write a PRD for a snake game" - context = Context() # The session Context object is explicitly created, and the Role object implicitly shares it automatically with its own Action object - role = ProductManager(context=context) - while msg: - msg = await role.run(msg) - logger.info(str(msg)) + # 1. 
Create ProductManager instance + pm = ProductManager( + name="Alice", # Use default name or customize + use_fixed_sop=True, # Enable fixed Standard Operating Procedure mode + ) + + # 2. Prepare user requirement + requirement = "Write a PRD for a snake game" + + # 3. Create requirement message + requirement_msg = Message( + content=requirement, + role="user" + ) + + # 4. Run ProductManager to get PRD + result = await pm.run(with_message=requirement_msg) + + logger.info(result) if __name__ == '__main__': asyncio.run(main()) + ``` ## Develop your first agent diff --git a/src/en/guide/tutorials/atomic_roles/intro.md b/src/en/guide/tutorials/atomic_roles/intro.md new file mode 100644 index 00000000..560a0a9f --- /dev/null +++ b/src/en/guide/tutorials/atomic_roles/intro.md @@ -0,0 +1,40 @@ +# RoleZero Architecture Design Specification + +## **Background: Evolution from SOPs to a General Agent Framework** + +In traditional agent frameworks, **Standard Operating Procedures (SOPs)** serve as the core solution for addressing specific scenarios. For example, in a software development environment, SOPs strictly define the code directory structure, data interaction formats, and task execution sequences. However, these SOPs have significant drawbacks: + +1. **Strong scenario dependency**: SOPs are highly coupled with specific business scenarios, making them difficult to adapt to other domains (e.g., healthcare, finance). +1. **Poor scalability**: Adding new business requirements necessitates custom development, leading to high development costs and low iteration efficiency. +1. **Weak fault tolerance**: If the process is interrupted, it cannot resume from the breakpoint and must restart from the beginning. + +For example, in a software company, SOPs require agents to interact with data in a fixed directory structure. However, third-party projects may use different structures, rendering the agent incompatible. 
Therefore, a **modular and generalized** framework is needed to decouple processes from scenarios, enhancing the agent's adaptability. + +## **Objective: Building Core Capabilities for a General Agent** + +The goal of RoleZero is to **overcome the limitations of SOPs through atomic functional elements and dynamic process orchestration**, achieving the following capabilities: + +1. **Flexible process orchestration**: Solve business problems dynamically using `think->action loops` or **chained atomic units** without custom development. +1. **Breakpoint recovery**: Resume tasks from the last successful node in case of an exception. +1. **Seamless business integration**: Support cross-domain collaboration (e.g., software company SOPs directly modifying third-party code) without additional development. + +## **Core Capabilities of RoleZero** + +As a general template for agents, RoleZero covers the entire lifecycle of intelligent agents: + +1. **Data Understanding (ENV/IO)** : Dynamically parse the structure and semantics of environmental inputs (e.g., code, documents). + +2. **Observation (Observe)** : Filter and format key data from the environment (ENV) for decision-making. + +3. **Thinking (Think)** : Dynamically generate or adjust task plans, supporting four types of decision logic: + + - **Task decomposition**: Break down ambiguous goals into atomic subtasks (e.g., "Develop login feature" → Design API → Write code → Test). + - **Task retry**: Adjust task constraints based on error feedback (e.g., add code format checks). + - **Process progression**: Mark the current task as complete and trigger the next task. + - **Human assistance**: Seek user clarification when unable to make decisions (e.g., asking for additional data or seeking user suggestions in case of errors or uncertainty). + +4. **Execution (Act)** : Call tools to execute atomic tasks, supporting experience reuse and context injection. + +5. **Memory (Memory)** : Store task states and historical data. 
+ +6. **Evaluation (Evaluate)** : Dynamically verify task results. diff --git a/src/en/guide/tutorials/atomic_roles/role_zero.md b/src/en/guide/tutorials/atomic_roles/role_zero.md new file mode 100644 index 00000000..79c90abc --- /dev/null +++ b/src/en/guide/tutorials/atomic_roles/role_zero.md @@ -0,0 +1,322 @@ +# RoleZero + +## RoleZero Core Concepts + +`RoleZero` is a role in the MetaGPT system. It inherits from the base role class `Role` and is used to implement an agent that can think and act dynamically. The main function of the `RoleZero` class is to provide a flexible framework for the agent, so that it can dynamically select and execute tasks based on the information in the environment. Specifically, `RoleZero` can: + +**Dynamic thinking and decision-making**: Determine which operation to perform next by calling LLM (Large Language Model), and update its own state based on the current context, history, and memory information. + +**Execute tool instructions**: `RoleZero` has built-in calls to tools such as browsers (`Browser`) and editors (`Editor`). The instructions are associated with the corresponding tool methods through a mapping mechanism to realize the automatic execution of functions such as web page operations and file operations. + +**Planning and Task Management**: Built-in `Planner` module, which can plan, decompose and track the status of tasks, and support the thinking-action cycle similar to the ReAct model. + +**Interacting with people**: Provides ask_human and reply_to_human methods. When the agent encounters difficult problems or needs human assistance, it can actively ask or reply to humans. + +## RoleZero role operation mechanism + +![Operation Mechanism](/public/image/guide/tutorials/role_zero.png) + +## Code Encapsulation and Detailed Explanation + +The code encapsulation of `RoleZero` implements the design concept of modularization and clear responsibilities. The following is an explanation of each functional module: + +**1. 
Basic inheritance and tool registration** + +- `RoleZero` inherits from the base class `Role`, and reuses the memory, context, message processing and other functions of the base role. + +- Expose itself as a callable tool through the `@register_tool(include_functions=["ask_human", "reply_to_human"])` decorator. This allows other modules in the system to access the interactive interface of `RoleZero` through a unified tool registration mechanism. + +**2. Initialization and Validation (Validator)** + +- In the model validator (such as `set_plan_and_tool` and `set_tool_execution`), `RoleZero` initializes the internal state of the role after instantiation: + +- Planning and tool settings: through `set_plan_and_tool`, initialize `Planner`, set the reaction mode to **react** and make corresponding settings; at the same time, through `set_tool_execution`, construct a tool execution map (tool_execution_map) and register the corresponding methods of external tools (such as browsers, editors, terminals), so that when executing commands, the corresponding tool functions can be called through mapping. + +- Long-term Memory: In `set_longterm_memory`, the configuration conditions determine whether to enable long-term memory, which facilitates saving and retrieving historical information in multiple dialogue rounds. + +**3. Thinking and Action Loop** + +- **\_think method**: `RoleZero` overrides the `_think` method to construct a prompt (`prompt`) based on the current state, memory, and historical dialogue records, use `LLM` to determine the next action, and update the internal state (for example, call `_set_state` to set the next action to be performed). + +- **\_act and react methods**: + +- The `_act` method is responsible for executing the action determined by the `_think` method, processing the instructions returned by `LLM`, and calling the corresponding tools. 
+ +- The `_react` method is a typical thinking-action loop: it calls `_think` and `_act` repeatedly until the configured upper limit is reached, thus forming a complete reaction loop. + +- **Quick thinking mode**: In the `_quick_think` method, `RoleZero` provides a quick response capability. When the current message does not appear to require a complete cycle, the answer is generated directly to save LLM call cost. + +**4. Command parsing and execution** + +- Command parsing: The method `_parse_commands` parses the text returned by the LLM into a command list in JSON format and repairs errors, such as malformed JSON and escape character problems, to ensure that subsequent commands can be executed correctly. + +- Duplicate check: The `_check_duplicates` method detects whether similar answers exist in recent memory to avoid generating the same response repeatedly, and calls the human interaction interface for help when necessary. + +- Command execution: The `_run_commands` method traverses the parsed commands and calls the corresponding tool method according to the priority or special command rules. After the commands are executed, the results are summarized to form the agent's final answer. + +**5. Human Interface** + +- The `ask_human` and `reply_to_human` methods encapsulate how to ask or reply to users when the system encounters complex problems or repeated errors. These methods usually determine the current environment type (for example, whether it belongs to MGXEnv), and then decide whether to actually hand the request over to human intervention. + +## Customizing the RoleZero Role Process + +1. ### Registering Tools + +One of the most important features of `RoleZero` is the dynamic operation of tools. If the functions or classes you want the LLM to use have not been registered as tools, you need to register them as tools first. 
After registration, `ToolRegistry` and `ToolRecommender` present the tool's signature, `docstring`, and other details to the agent so that it can select a tool. + +2. ### Fill in initialization parameters + +- In `tools`, fill in the names of the tools to be used. If all functions in an entire class are used, fill in `class name`. If only some functions in a class are used, fill in `class name.function name`. In the `_think` stage, all tools specified by `tools` are presented to the LLM for selection. + +```python +@register_tool(include_functions=["ask_human", "reply_to_human"]) +class RoleZero(Role): + """A role who can think and act dynamically""" + + # Basic Info + name: str = "Zero" + profile: str = "RoleZero" + goal: str = "" # Describe the role responsibilities to facilitate TL to assign tasks + system_msg: list[str] = None # Use None to conform to the default value at llm.aask + cmd_prompt: str = CMD_PROMPT # Used to determine the command generated by the current step_think + instruction: str = ROLE_INSTRUCTION # Role-specific logic will be filled in cmd_prompt as a paragraph. For simpler roles, just change instruction. Otherwise, change cmd_prompt + + # React Mode + react_mode: Literal["react"] = "react" + max_react_loop: int = 50 # used for react mode + + # Tools + tools: list[str] = [] # Use special symbol ["<all>"] to indicate use of all registered tools + tool_recommender: Optional[ToolRecommender] = None + tool_execution_map: Annotated[dict[str, Callable], Field(exclude=True)] = {} + special_tool_commands: list[str] = ["Plan.finish_current_task", "end", "Terminal.run_command", "RoleZero.ask_human"] + # List of exclusive tool commands. + # If multiple instances of these commands appear, only the first occurrence will be retained. 
+ exclusive_tool_commands: list[str] = [ + "Editor.edit_file_by_replace", + "Editor.insert_content_at_line", + "Editor.append_file", + "Editor.open_file", + ] + # Equipped with three basic tools by default for optional use + editor: Editor = Editor(enable_auto_lint=True) + browser: Browser = Browser() + + # Experience + experience_retriever: Annotated[ExpRetriever, Field(exclude=True)] = DummyExpRetriever() + + # Others + observe_all_msg_from_buffer: bool = True + command_rsp: str = "" # the raw string containing the commands + commands: list[dict] = [] # commands to be executed + memory_k: int = 200 # number of memories (messages) to use as historical context + use_fixed_sop: bool = False + respond_language: str = "" # Language for responding humans and publishing messages. + use_summary: bool = True # whether to summarize at the end + +``` + +3. ### Define the mapping from tool name to tool function + +Rewrite `_update_tool_execution`. This step mainly specifies how the command generated by the role corresponds to the function to be executed + +```python + @model_validator(mode="after") + def set_tool_execution(self) -> "RoleZero": + # default map + self.tool_execution_map = { + "Plan.append_task": self.planner.plan.append_task, + "Plan.reset_task": self.planner.plan.reset_task, + "Plan.replace_task": self.planner.plan.replace_task, + "Editor.write": self.editor.write, + "Editor.write_content": self.editor.write_content, + "Editor.read": self.editor.read, + "RoleZero.ask_human": self.ask_human, + "RoleZero.reply_to_human": self.reply_to_human, + } + # can be updated by subclass + self._update_tool_execution() + return self + + def _update_tool_execution(self): + pass +``` + +## **`SimpleReviewAssistant` role case analysis** + +In the implementation of the `SimpleReviewAssistant` role, the role is designed as a simple automated review generation assistant that can use the `GeneratePositiveReview` tool to generate positive reviews for products, stores, or services. 
This role inherits from `RoleZero` and registers the necessary tools to enable it to have functions such as a browser and a review generation tool. + +```python +class SimpleReviewAssistant(RoleZero): + """Rating Assistant helps users automatically generate positive reviews for products, stores or services""" + + name: str = "SimpleReviewAssistant" + profile: str = "Automated Positive Review Generator" + goal: str = "Generate positive reviews for your product, store or service." + tools: list[str] = ["RoleZero", Browser.__name__, "GeneratePositiveReview"] + + instruction: str = "Use GeneratePositiveReview tool to generate a positive review for a given product, store or service." + + def _update_tool_execution(self): + review_generator = GeneratePositiveReview() + self.tool_execution_map.update(tool2name(GeneratePositiveReview, ["run"], review_generator.run)) +``` + +### **1. `Action` configuration and execution** + +`SimpleReviewAssistant` relies on the `GeneratePositiveReview` `Action` to complete the automatic review generation task. + +```python +@register_tool(include_functions=["run"]) +class GeneratePositiveReview(Action): + """Generates a positive review for a product, store, or service.""" + + name: str = "GeneratePositiveReview" + input_args: Optional[BaseModel] = Field(default=None, exclude=True) + + PROMPT_TEMPLATE: str = """ + You are a professional product reviewer, and your task is to write a positive review for the following item: + + Item Type: {category} + Item Name: {item_name} + + Review Guidelines: + - Use a friendly, engaging, and positive tone. + - Highlight key advantages such as quality, value for money, experience, or convenience. + - Add a touch of personal experience to make the review more authentic. + - Ensure the review fits the intended platform, such as an e-commerce site (Amazon, eBay, Shopify), a food delivery service (UberEats, DoorDash, Meituan), or a local store/service. 
+ + Examples of Positive Reviews: + + E-commerce Product (Electronics): + - "The {item_name} is absolutely fantastic! The build quality is excellent, and the performance exceeded my expectations. Battery life is great, and the sleek design makes it super stylish. Highly recommend!" + + Restaurant (Food Delivery - Meituan, UberEats, Yelp): + - "I ordered from {item_name}, and the food was delicious! Fresh ingredients, perfect seasoning, and fast delivery. The packaging was neat, and the portion size was generous. Will definitely order again!" + + Local Store (Retail, Clothing, Cosmetics): + - "Shopping at {item_name} was a wonderful experience! The store was well-organized, the staff was friendly, and the product selection was amazing. Prices were fair, and I found exactly what I needed!" + + Service (Salon, Repair, Cleaning, etc.): + - "I booked a service at {item_name}, and I’m beyond satisfied! The staff was professional, punctual, and highly skilled. Everything was handled smoothly, and I felt valued as a customer. Highly recommend!" + + Please generate a 50-100 word review following these examples: + """ + + async def run( + self, + with_messages: List[Message] = None, + *, + item_name: str = "This product", + category: str = "General", + **kwargs, + ) -> Union[AIMessage, str]: + """ + Generates a positive review for a product, store, or service. + + Args: + item_name (str): The name of the product, store, or service (default: "This product"). + category (str): The category of the item (e.g., "Electronics", "Restaurant", "Service"). + + Returns: + AIMessage: A well-crafted positive review. + + Example: + >>> action = GeneratePositiveReview() + >>> result = await action.run(item_name="Wireless Earbuds", category="Electronics") + >>> print(result) + AIMessage(content="These wireless earbuds are fantastic! The sound quality is crisp, the fit is comfortable, and the battery lasts forever. 
Highly recommend!") + """ + if not item_name: + return AIMessage(content="Please provide an item name for the review.", cause_by=self) + + # Fill the prompt with user inputs + prompt = self.PROMPT_TEMPLATE.format(item_name=item_name, category=category) + + # Generate a review using LLM + generated_review = await self._aask(prompt) + + return AIMessage(content=generated_review, cause_by=self) + +``` + +**What `GeneratePositiveReview` does** + +- `GeneratePositiveReview` is an `Action` that is registered to `tool_registry` and can be called from the `SimpleReviewAssistant` role. +- Its `run()` method uses LLM to generate a positive review that conforms to the `PROMPT_TEMPLATE` specification. +- The `run` method receives `item_name` (product/store/service name) and `category` (category), then fills in `PROMPT_TEMPLATE`, sends a request to LLM, and finally returns the AI-generated review. +- Tool methods need to strictly explain their functions, parameter meanings, and calling methods, which helps roles dynamically select this tool and generate corresponding parameters for calling. + +--- + +### **2. Role Tools (`Tools`) Configuration** + +In the `SimpleReviewAssistant` role definition, multiple tools are registered, including: + +```python +tools: list[str] = ["RoleZero", Browser.__name__, "GeneratePositiveReview"] +``` + +- **`RoleZero`**: The inherited basic role framework. +- **`Browser`**: Provides web search capabilities (such as obtaining product review data). +- **`GeneratePositiveReview`**: An `Action` for automatically generating positive reviews. 
+ +These tools are managed internally by `RoleZero` and can be registered through `_update_tool_execution()`, so that the role can correctly call `GeneratePositiveReview`: + +```python + +def _update_tool_execution(self): + review_generator = GeneratePositiveReview() + self.tool_execution_map.update(tool2name(GeneratePositiveReview, ["run"], review_generator.run)) +``` + +Here, the `tool2name` method maps `GeneratePositiveReview` into `tool_execution_map`, so that when the role executes the task, it can directly call the `run()` method of `GeneratePositiveReview` to generate positive reviews. + +--- + +### **3. `MGXEnv` operation process** + +```python +async def run_on_mgx_env(): + mgx_env = MGXEnv() + ra = SimpleReviewAssistant() + msg = Message(content="Write a good review for airpods pro2") + mgx_env.add_roles([TeamLeader(), ra]) + mgx_env.publish_message(msg) + + start_time = time.time() + while time.time() - start_time < 15: + if not mgx_env.is_idle: + ret = await mgx_env.run() + logger.debug(ret) + start_time = time.time() +``` + +**Execution process analysis** + +1. **Initialize the `MGXEnv` runtime environment** and create the `SimpleReviewAssistant` role (`ra`). + +2. **Add roles** (`TeamLeader()` and `SimpleReviewAssistant()`). + +3. **Publish tasks**, such as `"Write a good review for airpods pro2"`. + +4. **Loop to check the `MGXEnv` status**: + + - If `MGXEnv` is active, run the task. + - The role dynamically decides to select the appropriate tool for execution based on the task and tool information. + +### **4. Summary** + +`RoleZero` and its subclass `SimpleReviewAssistant` enable the role to have dynamic task execution capabilities while ensuring its scalability through flexible configuration of tools (`Tools`) and tasks (`Action`). 
+ +- **Tools (`Tools`) configuration** + +- The role can directly register tools such as `Browser`, `Editor`, `SearchEnhancedQA`, etc., and can enhance functions without additional development. + +- The tool calling method is flexible, supporting direct specification of tool `class name` and `class name.method name` mapping, which facilitates the role to perform complex tasks. + +- **Task (`Action`) configuration and execution** + +- For `Action` or custom tool methods, `_update_tool_execution()` can be overridden, and `Action` can be converted to a tool through the `tool2name` method, and mapped to `tool_execution_map` for registration, making `RoleZero` highly scalable and flexible. + +This design not only ensures the controllability of the SOP method, but also allows the role to dynamically adjust tools and tasks according to specific scenarios, achieving more intelligent task processing. diff --git a/src/public/image/guide/tutorials/role_zero.png b/src/public/image/guide/tutorials/role_zero.png new file mode 100644 index 00000000..b03f71fc --- /dev/null +++ b/src/public/image/guide/tutorials/role_zero.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ac825fc390cdba2fe2d93385fb45f7392ecad9fe9a15fb94f8dd117dd986fccd +size 607231 diff --git a/src/zh/guide/get_started/configuration/llm_api_configuration.md b/src/zh/guide/get_started/configuration/llm_api_configuration.md index efb09ccf..b147b621 100644 --- a/src/zh/guide/get_started/configuration/llm_api_configuration.md +++ b/src/zh/guide/get_started/configuration/llm_api_configuration.md @@ -41,7 +41,7 @@ llm: It can be used to initialize the LLM. Because the o1 series has some usage restrictions, please report any problems to us promptly. -You can now get started! See the [Quickstart](./quickstart) or our [Tutorials](/guide/tutorials/agent_101) for your first run! +You can now get started! See the [Quickstart](./quickstart.md) or our [Tutorials](/guide/tutorials/agent_101.md) for your first run! 
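MetaGPT also supports a range of LLM models. Configure the model API keys according to your needs. diff --git a/src/zh/guide/in_depth_guides/agent_communication.md b/src/zh/guide/in_depth_guides/agent_communication.md index 8e9a8d1b..03117429 100644 --- a/src/zh/guide/in_depth_guides/agent_communication.md +++ b/src/zh/guide/in_depth_guides/agent_communication.md @@ -15,6 +15,8 @@ class Message(BaseModel): cause_by: str = Field(default="", validate_default=True) sent_from: str = Field(default="", validate_default=True) send_to: set[str] = Field(default={MESSAGE_ROUTE_TO_ALL}, validate_default=True) + metadata: Dict[str, Any] = Field(default_factory=dict) # metadata for `content` and `instruct_content` + ``` When planning the message forwarding process between agents, first determine the functional boundaries of each agent, just as you would when designing a function: diff --git a/src/zh/guide/in_depth_guides/environment/intro.md b/src/zh/guide/in_depth_guides/environment/intro.md index 834e852b..69271418 100644 --- a/src/zh/guide/in_depth_guides/environment/intro.md +++ b/src/zh/guide/in_depth_guides/environment/intro.md @@ -16,7 +16,7 @@ When defining observation and action spaces in `gymnasium`, discrete or continuous values are generally used. In the supported scene environments, however, game engine services or external simulators are more often accessed through APIs or interfaces. The action space (`gymnasium.spaces.Dict`) therefore contains the different action types and the subspace definitions of the input parameters each action requires. The observation space (`gymnasium.spaces.Dict`) contains the information that can be obtained from the environment, such as maps and screenshots. -In `metagpt/environment/base_env_space.py`, `BaseEnvActionType` defines the action types, `BaseEnvAction` defines the set of values for the action space, and `BaseEnvObsType` defines the observation types. +In `metagpt.base.base_env_space`, `BaseEnvActionType` defines the action types, `BaseEnvAction` defines the set of values for the action space, and `BaseEnvObsType` defines the observation types. In general, the observation obtained in `gymnasium` is a complete set of observed values, but in practice it is often necessary to obtain partial observations from the environment (for example, in Stanford Town, the map information within the agent's field of view rather than the full map). We added the `observe(self, obs_params: Optional[BaseEnvObsParams] = None)` method to obtain partial environment information; `BaseEnvObsParams` defines the parameters required to obtain an observation, including the observation type and its required arguments. ## Different scene environments @@ -28,3 +28,4 @@ - Added: [Stanford Town environment](./stanford_town.md) - Added: [Android emulator environment](./android.md) - To be added: [Web environment](./web.md) +- Added: [MGX environment](./mgx.md) diff --git a/src/zh/guide/in_depth_guides/environment/mgx.md b/src/zh/guide/in_depth_guides/environment/mgx.md new file mode 100644 index 00000000..7bcb3586 --- /dev/null +++ 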
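b/src/zh/guide/in_depth_guides/environment/mgx.md @@ -0,0 +1,98 @@ +# MGX Environment + +[Code entry](https://github.com/geekan/MetaGPT/tree/main/metagpt/environment/mgx/mgx_env.py) +MGXEnv is a general-purpose multi-agent collaboration environment that provides a flexible and powerful interaction framework. At initialization, the environment supports configuring multiple agents with different roles, each equipped with its own prompts to guide its behavior and responsibilities. The core feature of the environment is its message management mechanism: the TeamLeader acts as the central coordinator and manages the flow and distribution of all messages. This design keeps message delivery orderly while supporting flexible interaction, including public conversation and private communication. With this architecture, MGXEnv can support complex multi-agent collaboration scenarios, letting different roles divide the work efficiently according to their areas of expertise and task requirements. + +## Space definition + +### Message space + +MGXEnv mainly handles message routing and publishing in a multi-agent environment. The core message space is defined by the Message class and is structured as follows: + +Definition: +```python +from gymnasium import spaces + +space = { + "role": spaces.Text(16), # message role type + "content": spaces.Text(1024), # actual message content + "sent_from": spaces.Text(32), # sender name + "send_to": spaces.Set(spaces.Text(32)), # set of recipient names + "metadata": spaces.Dict(), # extra metadata such as images +} +``` + +Message space components: + +| Field | Description | Values | +|------|------|----------| +| role | Message role type | One of ["user", "assistant", "system"] | +| content | Actual message content | At most 1024 characters | +| sent_from | Name of the message sender | At most 32 characters | +| send_to | Set of recipient names | Each name at most 32 characters | +| metadata | Extra message metadata | Dictionary of optional fields (such as images) | + +Message example: +```python +from metagpt.schema import Message + +Message( + role="assistant", + content="I have completed the analysis.", + sent_from="Alice", + send_to={"Mike", ""}, + metadata={"agent": "Emma"} +) +``` + +### Communication modes + +The environment supports two communication modes: + +1. Public chat mode (default) +- All messages are visible to all roles (`send_to` contains the route-to-all marker, `MESSAGE_ROUTE_TO_ALL`) +- The team leader (Mike) coordinates the message flow +- Messages are stored in the environment history + +2. 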
Direct chat mode +- Triggered when the user talks directly to a specific role +- Takes place only between the user and the target role +- Bypasses the team leader +- Whether the message is published to everyone depends on the is_public_chat flag + +This environment focuses mainly on message routing and coordination rather than on the state/action spaces common in other environments. + + +## Usage + +```python +from metagpt.environment.mgx.mgx_env import MGXEnv +from metagpt.roles.di.team_leader import TeamLeader +from metagpt.schema import Message +from metagpt.roles import ( + Architect, + Engineer, + ProductManager, + ProjectManager, + QaEngineer, +) + +env = MGXEnv() + +env.add_roles( + [ + TeamLeader(), + ProductManager(), + Architect(), + ProjectManager(), + Engineer(n_borg=5, use_code_review=True), + QaEngineer(), + ] + ) +requirement = "create a 2048 game" +tl = env.get_role("Mike") +env.publish_message(Message(content=requirement, send_to=tl.name)) +await tl.run() + +``` + diff --git a/src/zh/guide/in_depth_guides/environment/werewolf.md b/src/zh/guide/in_depth_guides/environment/werewolf.md index 4545b726..8397af7d 100644 --- a/src/zh/guide/in_depth_guides/environment/werewolf.md +++ b/src/zh/guide/in_depth_guides/environment/werewolf.md @@ -12,7 +12,7 @@ ```python from gymnasium import spaces -from metagpt.environment.werewolf.const import STEP_INSTRUCTIONS +from metagpt.environment.werewolf.werewolf_ext_env import STEP_INSTRUCTIONS space = spaces.Dict( { diff --git a/src/zh/guide/tutorials/agent_101.md b/src/zh/guide/tutorials/agent_101.md index a6cf45db..4b8198d3 100644 --- a/src/zh/guide/tutorials/agent_101.md +++ b/src/zh/guide/tutorials/agent_101.md @@ -11,17 +11,30 @@ # Import any role, initialize it, run it with a starting message, and you are done! import asyncio -from metagpt.context import Context from metagpt.roles.product_manager import ProductManager from metagpt.logs import logger +from metagpt.schema import Message async def main(): - msg = "Write a PRD for a snake game" - context = Context() # Explicitly create the session Context object; the Role object implicitly shares it with its own Action objects - role = ProductManager(context=context) - while msg: - msg = await role.run(msg) - logger.info(str(msg)) + # 1. 
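ProductManager instance + pm = ProductManager( + name="Alice", # use the default name or a custom one + use_fixed_sop=True, # enable fixed SOP mode + ) + + # 2. Prepare the user requirement + requirement = "Write a PRD for a snake game" + + # 3. Create the requirement message + requirement_msg = Message( + content=requirement, + role="user" + ) + + # 4. Run ProductManager to obtain the PRD + result = await pm.run(with_message=requirement_msg) + + logger.info(result.content[:100]) if __name__ == '__main__': asyncio.run(main()) diff --git a/src/zh/guide/tutorials/atomic_roles/intro.md b/src/zh/guide/tutorials/atomic_roles/intro.md new file mode 100644 index 00000000..2b2bea8d --- /dev/null +++ b/src/zh/guide/tutorials/atomic_roles/intro.md @@ -0,0 +1,43 @@ +# RoleZero Architecture Design + +## **Background: from SOPs to a general agent framework** + +In traditional agent frameworks, **standard operating procedures (SOPs)** are the core solution to problems in a specific scenario. In software development, for example, SOPs strictly define the code directory structure, the data exchange format, and the order of task execution. Such SOPs have significant problems, however: + +1. **Tight coupling to the scenario**: SOPs are closely bound to a specific business scenario and are hard to migrate to other domains (such as healthcare or finance). +1. **Poor extensibility**: each new business case requires custom development, which raises development cost and slows iteration. +1. **Weak fault tolerance**: an interrupted process cannot resume from the point of failure and must be rerun from the start. + +Take a software company as an example: its SOPs require agents to exchange data through a fixed directory layout, but a third-party project may use a different structure, so the agents cannot adapt. An **atomic, general-purpose** framework is therefore needed to decouple process from scenario and improve the agents' ability to generalize. + +## **Goal: build the core capabilities of a general agent** + +RoleZero is designed to overcome the limitations of SOPs **through atomic functional elements and dynamic process orchestration**, providing the following capabilities: + +1. **Flexible process orchestration**: solve business problems dynamically through a `think->action loop` or by **chaining atomic units**, with no custom development. +1. **Checkpoint recovery**: when a task fails, execution can resume from the last successful node. +1. **Seamless business integration**: support cross-domain collaboration (for example, a software company SOP directly modifying third-party code) with no additional development. + +## **Core capabilities of RoleZero** + +As a general template for agents, RoleZero covers the agent's full life cycle: + +1. Data understanding (ENV/IO): dynamically parse the structure and semantics of environment inputs (such as code and documents). + +2. Observe: filter and format key data from the environment (ENV) as the basis for decisions. + +3. Think: dynamically generate or adjust the task plan, supporting four kinds of decision logic: + + - Task decomposition: break a vague goal into atomic subtasks (for example, "develop a login feature" → design the interface → write the code → test). + + - Task retry: adjust the task constraints according to error feedback (for example, add a code format check). + + - Process advancement: mark the current task as finished and trigger the next one. + + - Asking a human: request clarification from the user when no decision can be made (for example, asking the user to supply missing data to clarify intent, or to offer suggestions when an error occurs or the outcome is uncertain). + +4. Act: call tools to execute atomic tasks, with support for experience reuse and context injection. + +5. Memory: store task state and historical data + +6. 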
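Evaluate: dynamically validate task results diff --git a/src/zh/guide/tutorials/atomic_roles/role_zero.md b/src/zh/guide/tutorials/atomic_roles/role_zero.md new file mode 100644 index 00000000..b6b4b2b7 --- /dev/null +++ b/src/zh/guide/tutorials/atomic_roles/role_zero.md @@ -0,0 +1,322 @@ +# RoleZero + +## RoleZero core concepts + +`RoleZero` is a role in the MetaGPT system. It inherits from the base role class `Role` and implements an agent that can think and act dynamically. The main purpose of the `RoleZero` class is to give the agent a flexible framework for selecting and executing tasks dynamically, based on information in the environment. Specifically, `RoleZero` can: + +**Think and decide dynamically**: call the LLM (large language model) to determine which operation to perform next, and update its own state according to the current context, history, and memory. + +**Execute tool commands**: `RoleZero` has built-in support for calling tools such as the browser (`Browser`) and editor (`Editor`). A mapping mechanism associates commands with the corresponding tool methods, automating operations such as web page and file handling. + +**Plan and manage tasks**: the built-in `Planner` module can plan and decompose tasks and track their state, supporting a think-act loop similar to the ReAct model. + +**Interact with humans**: the ask_human and reply_to_human methods let the agent actively ask a human a question or reply when it meets a difficult problem or needs human assistance. + +## RoleZero operation mechanism + +![Operation Mechanism](/public/image/guide/tutorials/role_zero.png) + +## Code structure and explanation + +The `RoleZero` code follows a modular design with clearly separated responsibilities. Each functional module is described below: + +**1. Base inheritance and tool registration** + +- `RoleZero` inherits from the base class `Role` and reuses the base role's memory, context, and message handling. + +- The `@register_tool(include_functions=["ask_human", "reply_to_human"])` decorator exposes the class itself as a callable tool. Other modules in the system can then reach the interaction interface of `RoleZero` through the unified tool registration mechanism. + +**2. Initialization and validation (Validator)** + +- In the model validators (for example `set_plan_and_tool` and `set_tool_execution`), the internal state of the role is initialized after `RoleZero` is instantiated: + + - Planning and tool setup: `set_plan_and_tool` initializes the `Planner`, sets the react mode to **react**, and applies the related settings. `set_tool_execution` builds a tool execution map (tool_execution_map) that registers the methods of external tools (such as the browser, editor, and terminal), so that a command can be dispatched to the corresponding tool function at execution time. + + - Long-term memory (`Long-term Memory`): `set_longterm_memory` decides, according to the configuration, whether to enable long-term memory, which makes it possible to save and retrieve history across multiple conversation turns. + +**3. Think-act loop** + +- **\_think method**: `RoleZero` overrides `_think` to build a prompt (`prompt`) from the current state, memory, and conversation history, use the `LLM` to decide the next action, and update the internal state (for example, calling `_set_state` to set the next action to execute). + +- **\_act and react methods**: + + - `_act` executes the action chosen by `_think`, processes the commands returned by the `LLM`, and calls the corresponding tools. + + - `_react` is the typical think-then-act loop: it calls `_think` and `_act` repeatedly until the configured limit is reached, forming a complete reaction loop. + +- **Quick thinking mode**: in `_quick_think`, `RoleZero` provides a fast response path. When the current message probably does not need the full loop, it generates an answer quickly to save LLM calls. + +**4. 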
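Command parsing and execution** + +- Command parsing: the `_parse_commands` method parses the text returned by the LLM into a JSON list of commands and repairs errors, such as malformed JSON and escape character problems, so that the subsequent commands can be executed correctly. + +- Duplicate checking: the `_check_duplicates` method checks whether a similar answer already exists in recent memory, which avoids generating the same response repeatedly, and calls the human interaction interface for help when necessary. + +- Command execution: the `_run_commands` method iterates over the parsed commands, calls the corresponding tool methods according to priority or the rules for special commands, and aggregates the results into the agent's final answer. + +**5. Human interaction interface** + +- The `ask_human` and `reply_to_human` methods encapsulate how the system asks or replies to the user when it meets a complex problem or repeated errors. These methods usually check the current environment type (for example, whether it is MGXEnv) to decide whether to actually hand over to a human. + +## Workflow for customizing a RoleZero role + +1. ### Register tools + +One of the main features of `RoleZero` is operating tools dynamically. If the functions or classes you want the LLM to use are not yet registered as tools, register them first. Once they are registered, `ToolRegistry` and `ToolRecommender` present the tool signatures, `docstring`, and so on to the `Agent` so that it can choose among them. + +2. ### Fill in the initialization parameters + +- `tools` lists the names of the tools to use. To use every function of a class, give the `class name`; to use only some functions of a class, give `class name.function name`. In the `_think` stage, all tools listed in `tools` are presented to the LLM for selection. + +```python +@register_tool(include_functions=["ask_human", "reply_to_human"]) +class RoleZero(Role): + """A role who can think and act dynamically""" + + # Basic Info + name: str = "Zero" + profile: str = "RoleZero" + goal: str = "" # Describes the role's responsibilities so that the TL can assign tasks + system_msg: list[str] = None # Use None to conform to the default value at llm.aask + cmd_prompt: str = CMD_PROMPT # Used to determine the commands generated by _think in the current step + instruction: str = ROLE_INSTRUCTION # Role-specific logic, inserted into cmd_prompt as a paragraph; for simpler roles changing instruction is enough, otherwise change cmd_prompt + + # React Mode + react_mode: Literal["react"] = "react" + max_react_loop: int = 50 # used for react mode + + # Tools + tools: list[str] = [] # Use special symbol [""] to indicate use of all registered tools. The key setting: specifies which tools the role holds + tool_recommender: ToolRecommender = None + tool_execution_map: dict[str, Callable] = {} + special_tool_commands: list[str] = ["Plan.finish_current_task", "end", "Terminal.run_command", "RoleZero.ask_human"] + # List of exclusive tool commands. + # If multiple instances of these commands appear, only the first occurrence will be retained. 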
+ exclusive_tool_commands: list[str] = [ + "Editor.edit_file_by_replace", + "Editor.insert_content_at_line", + "Editor.append_file", + "Editor.open_file", + ] + + # Equipped with three basic tools by default for optional use + editor: Editor = Editor() + browser: Browser = Browser() + + # Experience + experience_retriever: ExpRetriever = DummyExpRetriever() + + # Others + observe_all_msg_from_buffer: bool = True + command_rsp: str = "" # the raw string containing the commands + commands: list[dict] = [] # commands to be executed + memory_k: int = 200 # number of memories (messages) to use as historical context + use_fixed_sop: bool = False + respond_language: str = "" # Language for responding humans and publishing messages. + use_summary: bool = True # whether to summarize at the end + +``` + +3. ### Define the mapping from tool names to tool functions + +Override `_update_tool_execution`. This step specifies how each command generated by the role maps to the function that will be executed. + +```python + @model_validator(mode="after") + def set_tool_execution(self) -> "RoleZero": + # default map + self.tool_execution_map = { + "Plan.append_task": self.planner.plan.append_task, + "Plan.reset_task": self.planner.plan.reset_task, + "Plan.replace_task": self.planner.plan.replace_task, + "Editor.write": self.editor.write, + "Editor.write_content": self.editor.write_content, + "Editor.read": self.editor.read, + "RoleZero.ask_human": self.ask_human, + "RoleZero.reply_to_human": self.reply_to_human, + } + # can be updated by subclass + self._update_tool_execution() + return self + + def _update_tool_execution(self): + pass +``` + +## **`SimpleReviewAssistant` role case study** + +The `SimpleReviewAssistant` role is designed as a simple automated assistant for generating positive reviews. It uses the `GeneratePositiveReview` tool to generate positive reviews for products, stores, or services. The role inherits from `RoleZero` and registers the tools it needs, which gives it browser and review generation capabilities. + +```python +class SimpleReviewAssistant(RoleZero): + """Rating Assistant helps users automatically generate positive reviews for products, stores or services""" + + name: str = "SimpleReviewAssistant" + profile: str = "Automated Positive Review Generator" + goal: str = 
"Generate positive reviews for your product, store or service." + tools: list[str] = ["RoleZero", Browser.__name__, "GeneratePositiveReview"] + + instruction: str = "Use GeneratePositiveReview tool to generate a positive review for a given product, store or service." + + def _update_tool_execution(self): + review_generator = GeneratePositiveReview() + self.tool_execution_map.update(tool2name(GeneratePositiveReview, ["run"], review_generator.run)) +``` + +## **1. `Action` 配置与执行** + +`SimpleReviewAssistant` 依赖 `GeneratePositiveReview` 这一 `Action` 来完成自动化评论生成任务。 + +```python +@register_tool(include_functions=["run"]) +class GeneratePositiveReview(Action): + """Generates a positive review for a product, store, or service.""" + + name: str = "GeneratePositiveReview" + input_args: Optional[BaseModel] = Field(default=None, exclude=True) + + PROMPT_TEMPLATE: str = """ + You are a professional product reviewer, and your task is to write a positive review for the following item: + + Item Type: {category} + Item Name: {item_name} + + Review Guidelines: + - Use a friendly, engaging, and positive tone. + - Highlight key advantages such as quality, value for money, experience, or convenience. + - Add a touch of personal experience to make the review more authentic. + - Ensure the review fits the intended platform, such as an e-commerce site (Amazon, eBay, Shopify), a food delivery service (UberEats, DoorDash, Meituan), or a local store/service. + + Examples of Positive Reviews: + + E-commerce Product (Electronics): + - "The {item_name} is absolutely fantastic! The build quality is excellent, and the performance exceeded my expectations. Battery life is great, and the sleek design makes it super stylish. Highly recommend!" + + Restaurant (Food Delivery - Meituan, UberEats, Yelp): + - "I ordered from {item_name}, and the food was delicious! Fresh ingredients, perfect seasoning, and fast delivery. The packaging was neat, and the portion size was generous. 
Will definitely order again!" + + Local Store (Retail, Clothing, Cosmetics): + - "Shopping at {item_name} was a wonderful experience! The store was well-organized, the staff was friendly, and the product selection was amazing. Prices were fair, and I found exactly what I needed!" + + Service (Salon, Repair, Cleaning, etc.): + - "I booked a service at {item_name}, and I’m beyond satisfied! The staff was professional, punctual, and highly skilled. Everything was handled smoothly, and I felt valued as a customer. Highly recommend!" + + Please generate a 50-100 word review following these examples: + """ + + async def run( + self, + with_messages: List[Message] = None, + *, + item_name: str = "This product", + category: str = "General", + **kwargs, + ) -> Union[AIMessage, str]: + """ + Generates a positive review for a product, store, or service. + + Args: + item_name (str): The name of the product, store, or service (default: "This product"). + category (str): The category of the item (e.g., "Electronics", "Restaurant", "Service"). + + Returns: + AIMessage: A well-crafted positive review. + + Example: + >>> action = GeneratePositiveReview() + >>> result = await action.run(item_name="Wireless Earbuds", category="Electronics") + >>> print(result) + AIMessage(content="These wireless earbuds are fantastic! The sound quality is crisp, the fit is comfortable, and the battery lasts forever. 
Highly recommend!") + """ + if not item_name: + return AIMessage(content="Please provide an item name for the review.", cause_by=self) + + # Fill the prompt with user inputs + prompt = self.PROMPT_TEMPLATE.format(item_name=item_name, category=category) + + # Generate a review using LLM + generated_review = await self._aask(prompt) + + return AIMessage(content=generated_review, cause_by=self) + +``` + +### **What `GeneratePositiveReview` does** + +- `GeneratePositiveReview` is an `Action` that is registered in `tool_registry` and can be called from the `SimpleReviewAssistant` role. +- Its `run()` method uses the LLM to generate a positive review that follows `PROMPT_TEMPLATE`. +- The `run` method takes `item_name` (the name of the product, store, or service) and `category`, fills in `PROMPT_TEMPLATE`, sends the request to the LLM, and returns the AI-generated review. +- A tool method must document its purpose, the meaning of its parameters, and how it is called. This helps the role select the tool dynamically and generate the right arguments for the call. + +--- + +## **2. Role tools (`Tools`) configuration** + +The `SimpleReviewAssistant` role definition registers several tools: + +```python +tools: list[str] = ["RoleZero", Browser.__name__, "GeneratePositiveReview"] +``` + +- **`RoleZero`**: the inherited base role framework. +- **`Browser`**: provides web search capability (for example, fetching product review data). +- **`GeneratePositiveReview`**: the `Action` that generates positive reviews automatically. + +These tools are managed internally by `RoleZero` and are registered through `_update_tool_execution()`, so that the role can correctly call `GeneratePositiveReview`: + +```python + +def _update_tool_execution(self): + review_generator = GeneratePositiveReview() + self.tool_execution_map.update(tool2name(GeneratePositiveReview, ["run"], review_generator.run)) +``` + +Here the `tool2name` method registers `GeneratePositiveReview` in `tool_execution_map`, so that when the role executes a task it can call the `run()` method of `GeneratePositiveReview` directly to generate the positive review. + +--- + +## **3. `MGXEnv` operation process** + +```python +async def run_on_mgx_env(): + mgx_env = MGXEnv() + ra = SimpleReviewAssistant() + msg = Message(content="Write a good review for airpods pro2") + mgx_env.add_roles([TeamLeader(), ra]) + mgx_env.publish_message(msg) + + start_time = time.time() + while time.time() - start_time < 15: + if not mgx_env.is_idle: + ret = await mgx_env.run() + logger.debug(ret) + start_time = time.time() +``` + +### **Execution process analysis** + +1. 
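**Initialize the `MGXEnv` runtime environment** and create the `SimpleReviewAssistant` role (`ra`). + +2. **Add roles** (`TeamLeader()` and `SimpleReviewAssistant()`). + +3. **Publish a task**, such as `"Write a good review for airpods pro2"`. + +4. **Loop to check the `MGXEnv` status**: + + - If `MGXEnv` is not idle, run it. + - The role dynamically selects the appropriate tool to execute, based on the task and the tool information. + +### **4. Summary** + +Through **flexible configuration of tools (`Tools`) and tasks (`Action`)**, `RoleZero` and its subclass `SimpleReviewAssistant` give the role dynamic task execution capabilities while keeping it extensible. + +- **Tools (`Tools`) configuration** + + - The role can directly register tools such as `Browser`, `Editor`, and `SearchEnhancedQA`, gaining new capabilities without additional development. + - Tool references are flexible: a tool can be specified either by `class name` or by `class name.method name`, which makes it easier for the role to perform complex tasks. + +- **Task (`Action`) configuration and execution** + + - For an `Action` or a custom tool method, override `_update_tool_execution()`, convert the `Action` to a tool through the `tool2name` method, and register the result in `tool_execution_map`. This makes `RoleZero` highly extensible and flexible. + +This design preserves the controllability of the SOP approach while allowing the role to adjust tools and tasks dynamically for specific scenarios, achieving more intelligent task processing.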
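
The name-to-callable registration described in this document can be sketched without MetaGPT installed. The block below is a minimal illustration under stated assumptions, not the library's implementation: `tool2name_sketch`, `FakeReviewTool`, `run_command`, and the `command_name`/`args` command layout are all hypothetical stand-ins chosen for the example, and the real `tool2name` and `RoleZero` command handling may differ.

```python
import asyncio
from typing import Any, Callable, Dict, List


def tool2name_sketch(cls: type, methods: List[str], entry: Callable) -> Dict[str, Callable]:
    """Hypothetical stand-in: build "ClassName.method" keys that point at one callable."""
    return {f"{cls.__name__}.{method}": entry for method in methods}


class FakeReviewTool:
    """Hypothetical tool; a real Action would send a prompt to an LLM here."""

    async def run(self, item_name: str = "This product", category: str = "General") -> str:
        return f"[{category}] {item_name} is great!"


async def run_command(tool_map: Dict[str, Callable], command: Dict[str, Any]) -> Any:
    """Look up the callable for a parsed command and invoke it with its arguments."""
    func = tool_map[command["command_name"]]
    result = func(**command.get("args", {}))
    # Tools may be sync or async; await only when a coroutine was returned.
    if asyncio.iscoroutine(result):
        result = await result
    return result


# Register the tool the same way a subclass would inside _update_tool_execution().
tool = FakeReviewTool()
tool_execution_map: Dict[str, Callable] = {}
tool_execution_map.update(tool2name_sketch(FakeReviewTool, ["run"], tool.run))

# A command in the assumed shape, as if parsed from the LLM's JSON output.
command = {
    "command_name": "FakeReviewTool.run",
    "args": {"item_name": "AirPods Pro 2", "category": "Electronics"},
}
review = asyncio.run(run_command(tool_execution_map, command))
print(review)
```

Keeping the map keyed by plain strings means the model only has to emit a name and a set of keyword arguments; the role code never evaluates model output directly, it only looks up a callable that was registered in advance.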