docs: add Google Colab tutorial links (#161)

memoryforget · web-flow · commit bc09cd3fe312 · 2026-02-14T22:43:02.000+08:00
diff --git a/docs/en/notes/guide/agent/operator_assemble_line.md b/docs/en/notes/guide/agent/operator_assemble_line.md
@@ -111,6 +111,9 @@ After the script is executed, the console will print:
 - **[Execution]**: Execution status.
 
 #### 5. Practical Case: General Text Reasoning and Pseudo-Answer Generation
+
+You can follow the tutorial below, or run the example in the [Google Colab](https://colab.research.google.com/drive/1W3Wb1sTyea1xDAGmVu3Tyn7fcvrsppAp?usp=sharing) notebook we provide:
+
 We have a `tests/test.jsonl` file, where each line contains a `"raw_content"` field. Our goal is: based on the general English text content of this field, first invoke the large language model to generate reasoning-based answers for the text content, then generate pseudo-answers by generating candidate answers in multiple rounds and selecting the optimal one through statistics, and finally output key fields such as the list of candidate answers, optimal pseudo-answer, corresponding reasoning processes, and typical correct reasoning examples. Therefore, we select the `ReasoningAnswerGenerator` and `ReasoningPseudoAnswerGenerator` operators to orchestrate the Pipeline.
 
 The following is a complete configuration example:
diff --git a/docs/en/notes/guide/agent/operator_qa.md b/docs/en/notes/guide/agent/operator_qa.md
@@ -116,6 +116,8 @@ After the script is executed, the console behaves differently depending on the m
 
 #### 4. Practical Case: Find Operators for "Data Cleaning"
 
+You can follow the tutorial below, or run the example in the [Google Colab](https://colab.research.google.com/drive/1maDKWp-3zEQNScmL_S7MHUdUC1xyCIcK?usp=sharing) notebook we provide:
+
 Suppose you need to clean data when developing a Pipeline and want to know if there are ready-made operators in the DataFlow library for processing.
 
 **Scenario Configuration**: We set it to one-time query mode and specify to save the results locally for viewing detailed parameters in the code later.
diff --git a/docs/en/notes/guide/agent/operator_write.md b/docs/en/notes/guide/agent/operator_write.md
@@ -152,6 +152,8 @@ During script execution, the following key information will be output:
 
 #### 4. Practical Case: Writing a Sentiment Analysis Operator
 
+You can follow the tutorial below, or run the example in the [Google Colab](https://colab.research.google.com/drive/1oTkwMNwxMFGAe9rNtYCC47CQ9HxsA0uH?usp=sharing) notebook we provide:
+
 We have a log file `tests/test.jsonl` containing the field `"raw_content"`. We want to create an operator to perform sentiment analysis on the text content of this field.
 
 **Configuration Example:**
diff --git a/docs/en/notes/guide/agent/pipeline_prompt.md b/docs/en/notes/guide/agent/pipeline_prompt.md
@@ -126,6 +126,8 @@ After the script is executed, the console will print the generation process. You
 
 #### 4. Practical Case: Reuse the ReasoningQuestionFilter to Write a Filter Prompt for Financial Questions
 
+You can follow the tutorial below, or run the example in the [Google Colab](https://colab.research.google.com/drive/1cU5Eg6tuc7WVDG33tU9Wplza52e54kts?usp=sharing) notebook we provide:
+
 Suppose we want to reuse the `ReasoningQuestionFilter` operator in the system and turn it into a filter for financial domain questions. Open the script and modify the configuration as follows:
 
 ```python
diff --git a/docs/en/notes/guide/agent/pipeline_rec&refine.md b/docs/en/notes/guide/agent/pipeline_rec&refine.md
@@ -185,6 +185,8 @@ After the script is executed, the console will print the execution logs and the
 
 ##### 4. Practical Case: Pre-training Data Cleaning Pipeline
 
+You can follow the tutorial below, or run the example in the [Google Colab](https://colab.research.google.com/drive/1MMJxRpfYi7Zd-jc_pyhvM1Y2WoQXOFcu?usp=sharing) notebook we provide:
+
 Suppose we have pre-training data `tests/test.jsonl` containing dirty data, and we want to clean it to obtain high-quality data. Open the script and modify the configuration as follows:
 
 **Scenario Configuration:**
@@ -300,6 +302,8 @@ python script/run_dfa_pipeline_refine.py
 
 ##### 3. Practical Case: Simplify the Pipeline
 
+You can follow the tutorial below, or run the example in the [Google Colab](https://colab.research.google.com/drive/1MMJxRpfYi7Zd-jc_pyhvM1Y2WoQXOFcu?usp=sharing) notebook we provide:
+
 Suppose the Pipeline generated in the previous step is too complex and contains redundant "cleaning" operators, and we want to remove them to simplify the Pipeline.
 
 **Scenario Configuration:**
diff --git a/docs/zh/notes/guide/agent/operator_assemble_line.md b/docs/zh/notes/guide/agent/operator_assemble_line.md
@@ -111,6 +111,9 @@ python script/run_dfa_op_assemble.py
 - **[Execution]**: 执行情况。
 
 #### 5. 实战 Case：通用文本推理与伪答案生成
+
+你可以参考以下教程学习，也可以参考我们提供的[Google Colab](https://colab.research.google.com/drive/1W3Wb1sTyea1xDAGmVu3Tyn7fcvrsppAp?usp=sharing)样例来运行：
+
 我们有一个 `tests/test.jsonl` 文件，里面每行都有一个 `"raw_content"` 字段。我们希望：基于该字段的通用英文文本内容，先调用大语言模型针对文本内容生成推理式答案，再通过多轮生成候选答案并统计选优的方式生成伪答案，最终输出候选答案列表、最优伪答案、对应推理过程及典型正确推理示例等关键字段。所以我们选择 `ReasoningAnswerGenerator` 和 `ReasoningPseudoAnswerGenerator` 两个算子来编排 Pipeline。
 
 以下是完整的配置示例：
diff --git a/docs/zh/notes/guide/agent/operator_qa.md b/docs/zh/notes/guide/agent/operator_qa.md
@@ -118,6 +118,8 @@ python script/run_dfa_operator_qa.py
 
 #### 4. 实战 Case：查找“清洗数据”的算子
 
+你可以参考以下教程学习，也可以参考我们提供的[Google Colab](https://colab.research.google.com/drive/1maDKWp-3zEQNScmL_S7MHUdUC1xyCIcK?usp=sharing)样例来运行：
+
 假设您在开发 Pipeline 时遇到数据需要清洗，想知道 DataFlow 库里有没有现成的算子可以处理。
 
 **场景配置：** 我们将其设置为单次查询模式，并指定将结果保存到本地，以便后续在代码中查看详细参数。
diff --git a/docs/zh/notes/guide/agent/operator_write.md b/docs/zh/notes/guide/agent/operator_write.md
@@ -132,6 +132,8 @@ python script/run_dfa_operator_write.py
 
 #### 4. 实战 Case：编写一个情感分析算子
 
+你可以参考以下教程学习，也可以参考我们提供的[Google Colab](https://colab.research.google.com/drive/1oTkwMNwxMFGAe9rNtYCC47CQ9HxsA0uH?usp=sharing)样例来运行：
+
 我们有一个日志文件 `tests/test.jsonl`，其中包含字段 `"raw_content"`。我们希望创建一个算子，对该字段的文本内容进行情感分析。
 
 **配置示例：**
diff --git a/docs/zh/notes/guide/agent/pipeline_prompt.md b/docs/zh/notes/guide/agent/pipeline_prompt.md
@@ -126,6 +126,8 @@ python script/run_dfa_pipeline_prompt.py
 
 #### 4. 实战 Case：复用ReasoningQuestionFilter过滤器，编写适用金融问题的过滤器提示词
 
+你可以参考以下教程学习，也可以参考我们提供的[Google Colab](https://colab.research.google.com/drive/1cU5Eg6tuc7WVDG33tU9Wplza52e54kts?usp=sharing)样例来运行：
+
 假设我们想复用系统中的 `ReasoningQuestionFilter` 算子，让它变成为一个金融领域问题的过滤器。打开脚本修改如下配置：
 
 ```python
diff --git a/docs/zh/notes/guide/agent/pipeline_rec&refine.md b/docs/zh/notes/guide/agent/pipeline_rec&refine.md
@@ -170,6 +170,8 @@ python script/run_dfa_pipeline_recommend.py
 
 ##### 4. 实战 Case：预训练数据清洗流水线
 
+你可以参考以下教程学习，也可以参考我们提供的[Google Colab](https://colab.research.google.com/drive/1MMJxRpfYi7Zd-jc_pyhvM1Y2WoQXOFcu?usp=sharing)样例来运行：
+
 假设我们有一个包含脏数据的预训练数据 `tests/test.jsonl`，我们希望清洗出一份高质量数据。打开脚本修改如下配置：
 
 **场景配置：**
@@ -285,6 +287,8 @@ python script/run_dfa_pipeline_refine.py
 
 ##### 3. 实战 Case：简化流水线
 
+你可以参考以下教程学习，也可以参考我们提供的[Google Colab](https://colab.research.google.com/drive/1MMJxRpfYi7Zd-jc_pyhvM1Y2WoQXOFcu?usp=sharing)样例来运行：
+
 假设上一步生成的流水线太复杂，包含了多余的“清洗”算子，我们希望将其移除来简化 Pipeline。
 
 **场景配置：**