Commit bc09cd3

docs: add Google Colab tutorial links (#161)
1 parent 7603fdd commit bc09cd3

10 files changed

Lines changed: 26 additions & 0 deletions

docs/en/notes/guide/agent/operator_assemble_line.md

Lines changed: 3 additions & 0 deletions
@@ -111,6 +111,9 @@ After the script is executed, the console will print:
 - **[Execution]**: Execution status.
 
 #### 5. Practical Case: General Text Reasoning and Pseudo-Answer Generation
+
+You can refer to the following tutorials for learning, and also use the sample of [Google Colab](https://colab.research.google.com/drive/1W3Wb1sTyea1xDAGmVu3Tyn7fcvrsppAp?usp=sharing) we provide to run the program:
+
 We have a `tests/test.jsonl` file, where each line contains a `"raw_content"` field. Our goal is: based on the general English text content of this field, first invoke the large language model to generate reasoning-based answers for the text content, then generate pseudo-answers by generating candidate answers in multiple rounds and selecting the optimal one through statistics, and finally output key fields such as the list of candidate answers, optimal pseudo-answer, corresponding reasoning processes, and typical correct reasoning examples. Therefore, we select the `ReasoningAnswerGenerator` and `ReasoningPseudoAnswerGenerator` operators to orchestrate the Pipeline.
 
 The following is a complete configuration example:
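
The configuration example itself falls outside this hunk. As an illustration of the pseudo-answer step described in the context paragraph above (generate candidates over several rounds, then pick the best one statistically), here is a minimal Python sketch that assumes "statistics" means a simple majority vote. `select_pseudo_answer` and the sample data are hypothetical and are not part of DataFlow or of the actual `ReasoningPseudoAnswerGenerator` API.

```python
from collections import Counter
from typing import Optional, Sequence


def select_pseudo_answer(candidates: Sequence[Optional[str]]) -> Optional[str]:
    """Return the most frequent non-empty candidate, or None if there is none.

    Hypothetical helper: assumes the "optimal" answer is the majority vote.
    """
    # Normalize whitespace and drop empty or missing candidates.
    cleaned = [c.strip() for c in candidates if c and c.strip()]
    if not cleaned:
        return None
    # most_common(1) returns the (answer, count) pair with the highest count;
    # answers with equal counts keep first-seen order.
    answer, _count = Counter(cleaned).most_common(1)[0]
    return answer


# Made-up candidate answers from five generation rounds for one record.
rounds = ["42", "41", "42 ", None, "42"]
best = select_pseudo_answer(rounds)
print(best)
```

A real pipeline would likely also keep the reasoning trace that produced each candidate, so that the trace matching the winning answer can be emitted alongside it.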

docs/en/notes/guide/agent/operator_qa.md

Lines changed: 2 additions & 0 deletions
@@ -116,6 +116,8 @@ After the script is executed, the console behaves differently depending on the m
 
 #### 4. Practical Case: Find Operators for "Data Cleaning"
 
+You can refer to the following tutorials for learning, and also use the sample of [Google Colab](https://colab.research.google.com/drive/1maDKWp-3zEQNScmL_S7MHUdUC1xyCIcK?usp=sharing) we provide to run the program:
+
 Suppose you need to clean data when developing a Pipeline and want to know if there are ready-made operators in the DataFlow library for processing.
 
 **Scenario Configuration**: We set it to one-time query mode and specify to save the results locally for viewing detailed parameters in the code later.

docs/en/notes/guide/agent/operator_write.md

Lines changed: 2 additions & 0 deletions
@@ -152,6 +152,8 @@ During script execution, the following key information will be output:
 
 #### 4. Practical Case: Writing a Sentiment Analysis Operator
 
+You can refer to the following tutorials for learning, and also use the sample of [Google Colab](https://colab.research.google.com/drive/1oTkwMNwxMFGAe9rNtYCC47CQ9HxsA0uH?usp=sharing) we provide to run the program:
+
 We have a log file `tests/test.jsonl` containing the field `"raw_content"`. We want to create an operator to perform sentiment analysis on the text content of this field.
 
 **Configuration Example:**

docs/en/notes/guide/agent/pipeline_prompt.md

Lines changed: 2 additions & 0 deletions
@@ -126,6 +126,8 @@ After the script is executed, the console will print the generation process. You
 
 #### 4. Practical Case: Reuse the ReasoningQuestionFilter to Write a Filter Prompt for Financial Questions
 
+You can refer to the following tutorials for learning, and also use the sample of [Google Colab](https://colab.research.google.com/drive/1cU5Eg6tuc7WVDG33tU9Wplza52e54kts?usp=sharing) we provide to run the program:
+
 Suppose we want to reuse the `ReasoningQuestionFilter` operator in the system and turn it into a filter for financial domain questions. Open the script and modify the configuration as follows:
 
 ```python

docs/en/notes/guide/agent/pipeline_rec&refine.md

Lines changed: 4 additions & 0 deletions
@@ -185,6 +185,8 @@ After the script is executed, the console will print the execution logs and the
 
 ##### 4. Practical Case: Pre-training Data Cleaning Pipeline
 
+You can refer to the following tutorials for learning, and also use the sample of [Google Colab](https://colab.research.google.com/drive/1MMJxRpfYi7Zd-jc_pyhvM1Y2WoQXOFcu?usp=sharing) we provide to run the program:
+
 Suppose we have pre-training data `tests/test.jsonl` containing dirty data, and we want to clean it to obtain high-quality data. Open the script and modify the configuration as follows:
 
 **Scenario Configuration:**
@@ -300,6 +302,8 @@ python script/run_dfa_pipeline_refine.py
 
 ##### 3. Practical Case: Simplify the Pipeline
 
+You can refer to the following tutorials for learning, and also use the sample of [Google Colab](https://colab.research.google.com/drive/1MMJxRpfYi7Zd-jc_pyhvM1Y2WoQXOFcu?usp=sharing) we provide to run the program:
+
 Suppose the Pipeline generated in the previous step is too complex and contains redundant "cleaning" operators, and we want to remove them to simplify the Pipeline.
 
 **Scenario Configuration:**

docs/zh/notes/guide/agent/operator_assemble_line.md

Lines changed: 3 additions & 0 deletions
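@@ -111,6 +111,9 @@ python script/run_dfa_op_assemble.py
 - **[Execution]**: Execution status.
 
 #### 5. Practical Case: General Text Reasoning and Pseudo-Answer Generation
+
+You can follow the tutorial below, or run the [Google Colab](https://colab.research.google.com/drive/1W3Wb1sTyea1xDAGmVu3Tyn7fcvrsppAp?usp=sharing) example we provide:
+
 We have a `tests/test.jsonl` file in which every line has a `"raw_content"` field. Our goal: based on the general English text in this field, first call the large language model to generate a reasoned answer for the text, then produce a pseudo-answer by generating candidate answers over multiple rounds and picking the best one statistically, and finally output key fields such as the candidate answer list, the best pseudo-answer, the corresponding reasoning process, and a typical correct reasoning example. We therefore choose the two operators `ReasoningAnswerGenerator` and `ReasoningPseudoAnswerGenerator` to orchestrate the Pipeline.
 
 The following is a complete configuration example: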

docs/zh/notes/guide/agent/operator_qa.md

Lines changed: 2 additions & 0 deletions
@@ -118,6 +118,8 @@ python script/run_dfa_operator_qa.py
 
 #### 4. Practical Case: Find Operators for "Data Cleaning"
 
+You can follow the tutorial below, or run the [Google Colab](https://colab.research.google.com/drive/1maDKWp-3zEQNScmL_S7MHUdUC1xyCIcK?usp=sharing) example we provide:
+
 Suppose you need to clean data while developing a Pipeline and want to know whether the DataFlow library already has a ready-made operator for it.
 
 **Scenario Configuration:** We set it to single-query mode and save the results locally so that the detailed parameters can be inspected in code later.

docs/zh/notes/guide/agent/operator_write.md

Lines changed: 2 additions & 0 deletions
@@ -132,6 +132,8 @@ python script/run_dfa_operator_write.py
 
 #### 4. Practical Case: Writing a Sentiment Analysis Operator
 
+You can follow the tutorial below, or run the [Google Colab](https://colab.research.google.com/drive/1oTkwMNwxMFGAe9rNtYCC47CQ9HxsA0uH?usp=sharing) example we provide:
+
 We have a log file `tests/test.jsonl` that contains the field `"raw_content"`. We want to create an operator that performs sentiment analysis on the text in this field.
 
 **Configuration Example:**

docs/zh/notes/guide/agent/pipeline_prompt.md

Lines changed: 2 additions & 0 deletions
@@ -126,6 +126,8 @@ python script/run_dfa_pipeline_prompt.py
 
 #### 4. Practical Case: Reuse the ReasoningQuestionFilter to Write a Filter Prompt for Financial Questions
 
+You can follow the tutorial below, or run the [Google Colab](https://colab.research.google.com/drive/1cU5Eg6tuc7WVDG33tU9Wplza52e54kts?usp=sharing) example we provide:
+
 Suppose we want to reuse the `ReasoningQuestionFilter` operator in the system and turn it into a filter for financial-domain questions. Open the script and modify the configuration as follows:
 
 ```python

docs/zh/notes/guide/agent/pipeline_rec&refine.md

Lines changed: 4 additions & 0 deletions
@@ -170,6 +170,8 @@ python script/run_dfa_pipeline_recommend.py
 
 ##### 4. Practical Case: Pre-training Data Cleaning Pipeline
 
+You can follow the tutorial below, or run the [Google Colab](https://colab.research.google.com/drive/1MMJxRpfYi7Zd-jc_pyhvM1Y2WoQXOFcu?usp=sharing) example we provide:
+
 Suppose we have pre-training data `tests/test.jsonl` that contains dirty data, and we want to clean it into a high-quality dataset. Open the script and modify the configuration as follows:
 
 **Scenario Configuration:**
@@ -285,6 +287,8 @@ python script/run_dfa_pipeline_refine.py
 
 ##### 3. Practical Case: Simplify the Pipeline
 
+You can follow the tutorial below, or run the [Google Colab](https://colab.research.google.com/drive/1MMJxRpfYi7Zd-jc_pyhvM1Y2WoQXOFcu?usp=sharing) example we provide:
+
 Suppose the pipeline generated in the previous step is too complex and contains redundant "cleaning" operators, and we want to remove them to simplify the Pipeline.
 
 **Scenario Configuration:**
