geekan · better629 · Mar 12, 2025 · Feb 6, 2025 · Feb 10, 2025 · Feb 13, 2025
diff --git a/.vitepress/config.mts b/.vitepress/config.mts
@@ -244,6 +244,16 @@ export default defineConfig({
                 collapsed: false,
                 items: [
                   { text: 'Concepts', link: 'tutorials/concepts.md' },
+                  {
+                    text: 'Atomic Agent',
+                    link: 'tutorials/atomic_roles/intro.md',
+                    items: [
+                      {
+                        text: 'RoleZero',
+                        link: 'tutorials/atomic_roles/role_zero.md',
+                      },
+                    ],
+                  },
                   { text: 'Agent 101', link: 'tutorials/agent_101.md' },
                   {
                     text: 'MultiAgent 101',
@@ -538,6 +548,16 @@ export default defineConfig({
                     text: '概念简述',
                     link: 'tutorials/concepts',
                   },
+                  {
+                    text: '原子化智能体',
+                    link: 'tutorials/atomic_roles/intro.md',
+                    items: [
+                      {
+                        text: 'RoleZero',
+                        link: 'tutorials/atomic_roles/role_zero.md',
+                      },
+                    ],
+                  },
                   {
                     text: '智能体入门',
                     link: 'tutorials/agent_101',

diff --git a/src/en/guide/get_started/configuration/llm_api_configuration.md b/src/en/guide/get_started/configuration/llm_api_configuration.md
@@ -1,4 +1,5 @@
 # LLM API Configuration
+ # LLM API Configuration
 
 After completing the installation, follow these steps to configure the LLM API, using the OpenAI API as an example. This process is similar for other LLM APIs.
 
@@ -38,7 +39,7 @@ llm:
 
 It can be used to initialize LLM. Due to some restrictions on the use of o1 series, problems can be reported to us in time.
 
-With these steps, your setup is complete. For starting with MetaGPT, check out the [Quickstart guide](./quickstart) or our [Tutorials](/en/guide/tutorials/agent_101).
+With these steps, your setup is complete. For starting with MetaGPT, check out the [Quickstart guide](./quickstart.md) or our [Tutorials](/en/guide/tutorials/agent_101.md).
 
 MetaGPT supports a range of LLM models. Configure your model API keys as needed.
 

diff --git a/src/en/guide/in_depth_guides/agent_communication.md b/src/en/guide/in_depth_guides/agent_communication.md
@@ -15,6 +15,8 @@ class Message(BaseModel):
     cause_by: str = Field(default="", validate_default=True)
     sent_from: str = Field(default="", validate_default=True)
     send_to: set[str] = Field(default={MESSAGE_ROUTE_TO_ALL}, validate_default=True)
+    metadata: Dict[str, Any] = Field(default_factory=dict)  # metadata for `content` and `instruct_content`
+
 ```
 
 When planning the message forwarding process between agents, it's essential to first determine the functional boundaries of the agents, similar to designing a function:
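
Review note: the new `metadata` field uses `default_factory=dict`, which should give every message its own dictionary rather than one shared default. The sketch below illustrates that behaviour with a standard-library dataclass. `SketchMessage` is a hypothetical stand-in, not the pydantic `Message` from `metagpt.schema`, and the `"<all>"` value of `MESSAGE_ROUTE_TO_ALL` is assumed from the MGX example in this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set

MESSAGE_ROUTE_TO_ALL = "<all>"  # assumed value, taken from the MGX example in this PR


@dataclass
class SketchMessage:
    """Hypothetical stand-in for metagpt.schema.Message (illustration only)."""

    content: str = ""
    sent_from: str = ""
    send_to: Set[str] = field(default_factory=lambda: {MESSAGE_ROUTE_TO_ALL})
    # A factory gives every instance a fresh dict instead of one shared default.
    metadata: Dict[str, Any] = field(default_factory=dict)


a = SketchMessage(content="first")
b = SketchMessage(content="second")
a.metadata["agent"] = "Emma"  # annotate one message only; b is unaffected
```

Because the dictionary is created per instance, annotating `a` leaves `b.metadata` empty.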

diff --git a/src/en/guide/in_depth_guides/environment/intro.md b/src/en/guide/in_depth_guides/environment/intro.md
@@ -12,6 +12,13 @@ In `ExtEnv`, we refer to the design of `gymnasium` in the reinforcement learning
 
 In addition, the decorators `mark_as_readable` and `mark_as_writeable` for the different `read-write` interfaces provided by `ExtEnv` are also provided to facilitate the unified management of method interfaces for external environment docking, so that subsequent agents can use them as A tool capability that can directly and automatically call different external environment docking interfaces based on the input natural language (this part of the function is to be opened).
 
+### Observation and action space definitions
+
+When defining observation and action spaces in `gymnasium`, values are usually discrete or continuous. In the scenario environments supported here, however, game engine services or external simulators are more often accessed through APIs or interfaces. The action space (`gymnasium.spaces.Dict`) therefore contains subspace definitions for the different action types and the input parameters each action requires. The observation space (`gymnasium.spaces.Dict`) contains information that can be obtained from the environment, such as maps and screenshots.
+
+`BaseEnvActionType` in `metagpt.base.base_env_space` defines the action type, `BaseEnvAction` defines a set of values corresponding to the action space, and `BaseEnvObsType` defines the observation type.
+Generally, the observation space values obtained in `gymnasium` are a complete set of observation values, but in practical applications, it is often necessary to obtain local observation values from the environment (for example, in Stanford Town, it is necessary to obtain map information within the field of view of the agent's location, rather than the complete map). We have added the `observe(self, obs_params: Optional[BaseEnvObsParams] = None)` method to obtain local environment information. `BaseEnvObsParams` defines the parameters required to obtain observation values, including the observation type and its required input parameters.
+
 ## Different Environments
 
 Currently, we provide several scenario environments and provide corresponding scenario usage entrances under `MetaGPT/examples/`.
@@ -20,4 +27,5 @@ Currently, we provide several scenario environments and provide corresponding sc
 - Added, [Werewolf Environment](./werewolf.md)
 - Added, [Stanford Town Environment](./stanford_town.md)
 - Added, [Android Environment](./android.md)
+- Added, [MGXEnv Environment](./mgx.md)
 - ToBeAdded, [Web Environment](./web.md)
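
Review note: the local-observation idea described in the new section can be sketched as follows. This is a minimal illustration under stated assumptions, not the MetaGPT implementation: `SketchObsType`, `SketchObsParams`, and `SketchGridEnv` are hypothetical stand-ins for `BaseEnvObsType`, `BaseEnvObsParams`, and an `ExtEnv` subclass, and the `position` and `radius` parameters are invented for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List, Optional, Tuple


class SketchObsType(IntEnum):
    """Hypothetical observation types."""

    NONE = 0       # return the complete observation
    LOCAL_MAP = 1  # return only the tiles around a position


@dataclass
class SketchObsParams:
    """Hypothetical parameters: the observation type plus the inputs it needs."""

    obs_type: SketchObsType = SketchObsType.NONE
    position: Tuple[int, int] = (0, 0)
    radius: int = 1


class SketchGridEnv:
    """Toy environment whose state is a 2D grid."""

    def __init__(self, grid: List[List[int]]):
        self.grid = grid

    def observe(self, obs_params: Optional[SketchObsParams] = None) -> Dict[str, List[List[int]]]:
        # No parameters means the caller wants the full observation.
        if obs_params is None or obs_params.obs_type == SketchObsType.NONE:
            return {"map": self.grid}
        # Otherwise crop the grid to a window around the requested position.
        row, col = obs_params.position
        r = obs_params.radius
        rows = self.grid[max(0, row - r): row + r + 1]
        return {"map": [line[max(0, col - r): col + r + 1] for line in rows]}


grid = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
env = SketchGridEnv(grid)
full = env.observe()
local = env.observe(SketchObsParams(SketchObsType.LOCAL_MAP, position=(2, 2), radius=1))
```

Keeping the parameters in one object lets the `observe` signature stay stable as new observation types are added.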
diff --git a/src/en/guide/in_depth_guides/environment/mgx.md b/src/en/guide/in_depth_guides/environment/mgx.md
@@ -0,0 +1,96 @@
+# MGX Environment
+
+[Code Entry](https://github.com/geekan/MetaGPT/tree/main/metagpt/environment/mgx/mgx_env.py)
+
+MGXEnv is a generic multi-agent collaboration environment that provides a flexible and powerful interaction framework. During initialization, the environment supports configuring multiple agents with different roles, each equipped with a specific prompt system to guide their behavior and responsibilities. The core feature of the environment is its unique message management mechanism: TeamLeader acts as a central coordinator, uniformly managing the flow and distribution of all messages. This design ensures both the orderliness of information transmission and supports flexible interaction methods, including public dialogue and private communication. Through this architectural design, MGXEnv can effectively support complex multi-agent collaboration scenarios, enabling different roles to efficiently complete division of labor and cooperation according to their respective professional fields and task requirements.
+
+## Space Definition
+
+### Message Space
+
+MGXEnv mainly handles message routing and publishing in a multi-agent environment. The core message space is defined by the Message class with the following structure:
+
+Definition:
+```python
+from gymnasium import spaces
+
+space = {
+    "role": spaces.Text(16),         # Message role type
+    "content": spaces.Text(1024),    # Actual message content
+    "sent_from": spaces.Text(32),    # Sender name
+    "send_to": spaces.Sequence(spaces.Text(32)),  # Recipient names; gymnasium has no Set space, so Sequence stands in
+    "metadata": spaces.Dict(),       # Additional metadata like images
+}
+```
+
+Message Space Components:
+
+| Field | Description | Value Range |
+|-------|-------------|-------------|
+| role | Message role type | One of ["user", "assistant", "system"] |
+| content | Actual message content | Maximum length 1024 characters |
+| sent_from | Message sender name | Maximum length 32 characters |
+| send_to | Set of recipient names | Each name maximum 32 characters |
+| metadata | Additional message metadata | Dictionary containing optional fields (like images) |
+
+Message Example:
+```python
+from metagpt.schema import Message
+
+Message(
+    role="assistant",
+    content="Analysis completed.", 
+    sent_from="Alice",
+    send_to={"Mike", "<all>"},
+    metadata={"agent": "Emma"}
+)
+```
+
+### Communication Modes
+
+The environment supports two communication modes:
+
+1. Public Chat Mode (default)
+   - All messages are visible to all roles (`send_to` includes `<all>`)
+   - Message flow is coordinated by the team leader (Mike)
+   - Messages are stored in the environment history
+
+2. Direct Chat Mode
+   - Triggered when the user directly messages a specific role
+   - Communication takes place only between the user and the target role
+   - Bypasses the team leader
+   - Whether the message is also published to all roles depends on the `is_public_chat` flag
+
+This environment focuses on message routing and coordination rather than traditional state/action spaces seen in other environments.
+
+## Usage
+
+```python
+import asyncio
+
+from metagpt.environment.mgx.mgx_env import MGXEnv
+from metagpt.roles import (
+    Architect,
+    Engineer,
+    ProductManager,
+    ProjectManager,
+    QaEngineer,
+)
+from metagpt.roles.di.team_leader import TeamLeader
+from metagpt.schema import Message
+
+
+async def main():
+    env = MGXEnv()
+    # Register the team leader first, then the specialist roles.
+    env.add_roles(
+        [
+            TeamLeader(),
+            ProductManager(),
+            Architect(),
+            ProjectManager(),
+            Engineer(n_borg=5, use_code_review=True),
+            QaEngineer(),
+        ]
+    )
+    requirement = "create a 2048 game"
+    tl = env.get_role("Mike")  # the team leader is named Mike in this document
+    env.publish_message(Message(content=requirement, send_to=tl.name))
+    await tl.run()
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
+```
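
Review note: the two communication modes described in `mgx.md` can be summarised with a small routing sketch. This is a simplified illustration, not the actual `MGXEnv` logic: `resolve_recipients` and its `direct_target` parameter are hypothetical names, while `"<all>"`, `"Mike"`, and `is_public_chat` come from the document itself.

```python
from typing import Set

ALL = "<all>"         # broadcast marker used in the Message example
TEAM_LEADER = "Mike"  # team leader name used in this document


def resolve_recipients(direct_target: str = "", is_public_chat: bool = True) -> Set[str]:
    """Return the recipient set for a user message (simplified, hypothetical)."""
    if direct_target:
        # Direct chat: bypass the team leader and talk to the target role.
        recipients = {direct_target}
        if is_public_chat:
            recipients.add(ALL)  # the message stays visible to everyone
        return recipients
    # Public chat: the team leader coordinates and everyone can see the message.
    return {TEAM_LEADER, ALL}


public = resolve_recipients()
private = resolve_recipients("Alice", is_public_chat=False)
```

In this sketch a direct message reaches only the named role unless the public flag is set, which mirrors the last bullet of the Direct Chat Mode description.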
diff --git a/src/en/guide/in_depth_guides/environment/werewolf.md b/src/en/guide/in_depth_guides/environment/werewolf.md
@@ -12,7 +12,7 @@ Definition:
 
 ```python
 from gymnasium import spaces
-from metagpt.environment.werewolf.const import STEP_INSTRUCTIONS
+from metagpt.environment.werewolf.werewolf_ext_env import STEP_INSTRUCTIONS
 
 space = spaces.Dict(
      {

diff --git a/src/en/guide/tutorials/agent_101.md b/src/en/guide/tutorials/agent_101.md
@@ -12,20 +12,34 @@ Import any role, initialize it, run it with a starting message, done!
 ```python
 import asyncio
 
-from metagpt.context import Context
 from metagpt.roles.product_manager import ProductManager
 from metagpt.logs import logger
+from metagpt.schema import Message
 
 async def main():
-    msg = "Write a PRD for a snake game"
-    context = Context()  # The session Context object is explicitly created, and the Role object implicitly shares it automatically with its own Action object
-    role = ProductManager(context=context)
-    while msg:
-        msg = await role.run(msg)
-        logger.info(str(msg))
+    # 1. Create ProductManager instance
+    pm = ProductManager(
+        name="Alice",  # Use default name or customize
+        use_fixed_sop=True,  # Enable fixed Standard Operating Procedure mode
+    )
+
+    # 2. Prepare user requirement
+    requirement = "Write a PRD for a snake game"
+
+    # 3. Create requirement message
+    requirement_msg = Message(
+        content=requirement,
+        role="user"
+    )
+
+    # 4. Run ProductManager to get PRD
+    result = await pm.run(with_message=requirement_msg)
+
+    logger.info(result)
 
 if __name__ == '__main__':
     asyncio.run(main())
+
 ```
 
 ## Develop your first agent

diff --git a/src/en/guide/tutorials/atomic_roles/intro.md b/src/en/guide/tutorials/atomic_roles/intro.md
@@ -0,0 +1,40 @@
+# RoleZero Architecture Design Specification
+
+## **Background: Evolution from SOPs to a General Agent Framework**
+
+In traditional agent frameworks, **Standard Operating Procedures (SOPs)** serve as the core solution for addressing specific scenarios. For example, in a software development environment, SOPs strictly define the code directory structure, data interaction formats, and task execution sequences. However, these SOPs have significant drawbacks:
+
+1.  **Strong scenario dependency**: SOPs are highly coupled with specific business scenarios, making them difficult to adapt to other domains (e.g., healthcare, finance).
+1.  **Poor scalability**: Adding new business requirements necessitates custom development, leading to high development costs and low iteration efficiency.
+1.  **Weak fault tolerance**: If the process is interrupted, it cannot resume from the breakpoint and must restart from the beginning.
+
+For example, in a software company, SOPs require agents to interact with data in a fixed directory structure. However, third-party projects may use different structures, rendering the agent incompatible. Therefore, a **modular and generalized** framework is needed to decouple processes from scenarios, enhancing the agent's adaptability.
+
+## **Objective: Building Core Capabilities for a General Agent**
+
+The goal of RoleZero is to **overcome the limitations of SOPs through atomic functional elements and dynamic process orchestration**, achieving the following capabilities:
+
+1.  **Flexible process orchestration**: Solve business problems dynamically using `think->action loops` or **chained atomic units** without custom development.
+1.  **Breakpoint recovery**: Resume tasks from the last successful node in case of an exception.
+1.  **Seamless business integration**: Support cross-domain collaboration (e.g., software company SOPs directly modifying third-party code) without additional development.
+
+## **Core Capabilities of RoleZero**
+
+As a general template for agents, RoleZero covers the entire lifecycle of intelligent agents:
+
+1.  **Data Understanding (ENV/IO)**: Dynamically parse the structure and semantics of environmental inputs (e.g., code, documents).
+
+2.  **Observation (Observe)**: Filter and format key data from the environment (ENV) for decision-making.
+
+3.  **Thinking (Think)**: Dynamically generate or adjust task plans, supporting four types of decision logic:
+
+    - **Task decomposition**: Break down ambiguous goals into atomic subtasks (e.g., "Develop login feature" → Design API → Write code → Test).
+    - **Task retry**: Adjust task constraints based on error feedback (e.g., add code format checks).
+    - **Process progression**: Mark the current task as complete and trigger the next task.
+    - **Human assistance**: Seek user clarification when unable to make decisions (e.g., asking for additional data or seeking user suggestions in case of errors or uncertainty).
+
+4.  **Execution (Act)**: Call tools to execute atomic tasks, supporting experience reuse and context injection.
+
+5.  **Memory (Memory)**: Store task states and historical data.
+
+6.  **Evaluation (Evaluate)**: Dynamically verify task results.
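
Review note: the breakpoint-recovery goal in `intro.md` can be illustrated with a small task loop that records each finished task in memory and skips recorded tasks on a rerun. This is a sketch under stated assumptions, not RoleZero's implementation: `run_tasks`, `flaky_act`, and the in-memory dict are hypothetical, and the task names reuse the decomposition example from the document.

```python
from typing import Callable, Dict, List


def run_tasks(tasks: List[str], act: Callable[[str], str], memory: Dict[str, str]) -> Dict[str, str]:
    """Run atomic tasks in order, skipping those already recorded in memory."""
    for task in tasks:
        if task in memory:
            continue  # breakpoint recovery: finished in an earlier run
        memory[task] = act(task)  # act, then persist the result
    return memory


calls: List[str] = []
fail_once = {"Write code"}  # simulate one transient failure


def flaky_act(task: str) -> str:
    """Hypothetical executor that fails the first time it sees 'Write code'."""
    calls.append(task)
    if task in fail_once:
        fail_once.discard(task)
        raise RuntimeError(f"interrupted during: {task}")
    return f"done: {task}"


tasks = ["Design API", "Write code", "Test"]
memory: Dict[str, str] = {}
try:
    run_tasks(tasks, flaky_act, memory)  # first run stops at "Write code"
except RuntimeError:
    pass
run_tasks(tasks, flaky_act, memory)  # second run resumes after "Design API"
```

Only the failed task and its successors are executed on the second run, because the result of "Design API" is already stored.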