From 7ba3abdfc35e54ec00eee133462604a6479389fe Mon Sep 17 00:00:00 2001
From: wongzhenhao <zhenhao1141@gmail.com>
Date: Mon, 2 Feb 2026 16:55:43 +0800
Subject: [PATCH 1/3] delete mathquestion_extract quickstart docs

---
 .../guide/quickstart/mathquestion_extract.md  | 187 -----------------
 .../guide/quickstart/mathquestion_extract.md  | 189 ------------------
 2 files changed, 376 deletions(-)
 delete mode 100644 docs/en/notes/guide/quickstart/mathquestion_extract.md
 delete mode 100644 docs/zh/notes/guide/quickstart/mathquestion_extract.md

diff --git a/docs/en/notes/guide/quickstart/mathquestion_extract.md b/docs/en/notes/guide/quickstart/mathquestion_extract.md
deleted file mode 100644
index bc80ca4b0..000000000
--- a/docs/en/notes/guide/quickstart/mathquestion_extract.md
+++ /dev/null
@@ -1,187 +0,0 @@
----
-title: Case 6. Math Problem Extraction
-createTime: 2025/07/16 20:10:28
-icon: teenyicons:receipt-outline
-permalink: /en/guide/t8ykcw9l/
----
-
-# Quick Start: Math Problem Extraction
-
-This example demonstrates how to use the `MathBookQuestionExtract` operator in Dataflow to automatically extract math problems from a textbook PDF and generate output in JSON/Markdown format.
-
-## 1 Environment and Dependencies
-
-1. Install Dataflow and MinerU dependencies  
-   ```shell
-   pip install "open-dataflow[mineru]"
-   ```  
-   Or install from source:  
-   ```shell
-   pip install -e ".[mineru]"
-   ```
-
-2. Download MinerU model weights  
-   ```shell
-   mineru-models-download
-   ```
-
-> This operator uses MinerU for PDF content segmentation and image extraction; please ensure that the installation and model weight download have succeeded.
-
-## 2 Configure LLM Serving
-
-This operator currently only supports API-based VLM Serving. Please configure the API URL and key before running.
-
-- Linux / macOS:  
-  ```shell
-  export DF_API_KEY="sk-xxxxx"
-  ```
-- Windows PowerShell:  
-  ```powershell
-  $env:DF_API_KEY = "sk-xxxxx"
-  ```
-
-The API key will be read from the environment variable in the code, so there is no need to hard-code it in the script.
-
-## 3 Prepare the Test PDF
-
-The example repository includes a test PDF:  
-```
-./dataflow/example/KBCleaningPipeline/questionextract_test.pdf
-```  
-You can also replace it with any math textbook or exercise collection PDF.
-
-## 4 Initialize and Modify the Script
-
-First, create a new `run_dataflow` folder anywhere, enter that directory, and then execute Dataflow project initialization:
-
-```shell
-mkdir run_dataflow
-cd run_dataflow
-dataflow init
-```
-
-After initialization is complete, the following file will appear in the project directory:
-
-```shell
-run_dataflow/playground/mathbook_extract.py
-```
-
-The contents of that script are as follows:
-
-```python
-from dataflow.operators.generate import MathBookQuestionExtract
-from dataflow.serving.APIVLMServing_openai import APIVLMServing_openai
-
-class QuestionExtractPipeline:
-    def __init__(self, llm_serving: APIVLMServing_openai):
-        self.extractor = MathBookQuestionExtract(llm_serving)
-        self.test_pdf = "../example/KBCleaningPipeline/questionextract_test.pdf"
-
-    def forward(
-        self,
-        pdf_path: str,
-        output_name: str,
-        output_dir: str,
-        api_url: str = "https://api.openai.com/v1/chat/completions",
-        key_name_of_api_key: str = "DF_API_KEY",
-        model_name: str = "o4-mini",
-        max_workers: int = 20
-    ):
-        self.extractor.run(
-            pdf_file_path=pdf_path,
-            output_file_name=output_name,
-            output_folder=output_dir,
-            api_url=api_url,
-            key_name_of_api_key=key_name_of_api_key,
-            model_name=model_name,
-            max_workers=max_workers
-        )
-
-if __name__ == "__main__":
-    # 1. Initialize LLM Serving
-    llm_serving = APIVLMServing_openai(
-        api_url="https://api.openai.com/v1/chat/completions",
-        model_name="o4-mini",      # It is recommended to use a strong reasoning model
-        max_workers=20             # Number of concurrent requests
-    )
-
-    # 2. Construct and run the extraction pipeline
-    pipeline = QuestionExtractPipeline(llm_serving)
-    pipeline.forward(
-        pdf_path=pipeline.test_pdf,
-        output_name="test_question_extract",
-        output_dir="./output"
-    )
-```
-
-### Key Parameter Explanation
-
-- `api_url`: OpenAI VLM endpoint URL  
-- `key_name_of_api_key`: Name of the environment variable  
-- `model_name`: Model name (e.g., `o4-mini`; strong reasoning models are recommended)  
-- `max_workers`: Number of concurrent requests  
-
-
-### Operator Logic
-
-The complete implementation of the operator is located at  
-`dataflow/operators/generate/KnowledgeCleaning/mathbook_question_extract.py`  
-Below, starting from the overall flow, we provide concise yet detailed explanations of each key stage to facilitate use and secondary development:
-
-1. PDF file splitting  
-   - Use `pymupdf` (fitz) to open the target PDF, rendering each page into a high-quality JPEG image at the specified DPI.  
-   - Save the images, named by page number, to the specified output directory, and log the conversion progress of each page to ensure traceability.
-
-2. Invoke MinerU for content recognition and image extraction  
-   - Dynamically import the `mineru` module; if it is not installed, throw a friendly prompt guiding the user to run `pip install mineru[pipeline]` and download the models.  
-   - Specify loading models from the local source via the environment variable `MINERU_MODEL_SOURCE=local`, supporting backend options `"vlm-sglang-engine"` or `"pipeline"`.  
-   - Execute the command-line tool:  
-```shell
-    mineru -p <pdf_file> -o <output_folder> -b <backend> --source local  
-```
-   - After execution, the tool will generate `*_content_list.json` (a structured content inventory) and a folder of the original split images in the intermediate directory.
-
-3. Organize and rename image resources  
-   - Read the `content_list.json` produced by MinerU, filtering out all items where `type=='image'`.  
-   - Copy the corresponding images from MinerU’s temporary directory to the final result folder, renaming them sequentially as `0.jpg, 1.jpg...`.  
-   - Also generate a new JSON inventory, recording each image’s page number in the source PDF and its new file path.
-
-4. Organize model invocation commands  
-   - Retrieve the predefined text prompt (`mathbook_question_extract_prompt`) from `dataflow.prompts.kbcleaning.KnowledgeCleanerPrompt`, specifying the task requirements and format conventions.  
-   - Package the rendered commands together with multiple input images (page snapshots, illustrations) to prepare for subsequent concurrent LLM service calls.
-
-5. Concurrently obtain model responses  
-   - Use `APIVLMServing_openai` (or another `LLMServingABC` implementation) combined with `ThreadPoolExecutor` to concurrently submit the packaged list of images and labels to the model.  
-   - Allow customization of the model name, API endpoint, concurrency level, and timeout to flexibly meet different performance and cost requirements.
-
-6. Parse and save the final output  
-   - In the `analyze_and_save` method, use regular expressions to precisely capture the `<image>index.jpg</image>` tags in the model’s returned text.  
-   - Copy the corresponding images referenced in the tags to the `images/` subfolder in the results directory.  
-   - Output the results in two formats:  
-     a. JSON file: sequentially store each question’s plain text (with tags removed) and the corresponding list of image paths  
-     b. Markdown file: embed images in the original text using the `![](images/xx.jpg)` format for easy visualization  
-   - All output files are saved in the user-specified result folder, facilitating subsequent verification and secondary use.
-
-## 5 Run the Script
-
-```shell
-python generate_question_extract_api.py
-```
-
-After it finishes, the `./output` directory will contain:
-
-- `test_question_extract.json`  
-  Each record includes:  
-  - `text`: Extracted problem text  
-  - `pics`: List of image paths involved in the problem  
-- `test_question_extract.md`  
-  Displays the problems and their images in Markdown format
-
-## 6 Optional Extensions
-
-- Custom prompts: To adjust the system prompt, replace it inside the operator:  
-  ```python
-  from dataflow.prompts.kbcleaning import KnowledgeCleanerPrompt
-  system_prompt = KnowledgeCleanerPrompt().mathbook_question_extract_prompt()
-  ```
-- Parameter customization: Supports switching the MinerU backend (`pipeline` | `vlm-sglang-engine`), adjusting DPI, concurrency, etc. See the `run` method signature in the operator.
\ No newline at end of file
diff --git a/docs/zh/notes/guide/quickstart/mathquestion_extract.md b/docs/zh/notes/guide/quickstart/mathquestion_extract.md
deleted file mode 100644
index 5a81d8540..000000000
--- a/docs/zh/notes/guide/quickstart/mathquestion_extract.md
+++ /dev/null
@@ -1,189 +0,0 @@
----
-title: 案例6. 数学问题提取
-createTime: 2025/07/16 20:10:28
-icon: teenyicons:receipt-outline
-permalink: /zh/guide/zchbl7uk/
----
-
-# 快速开始：数学问题提取
-
-本示例展示如何使用 Dataflow 中的 `MathBookQuestionExtract` 算子，自动从教材 PDF 中提取数学题目，并生成 JSON/Markdown 格式的输出。
-
-## 1 环境及依赖
-
-1. 安装 Dataflow 与 MinerU 依赖  
-   ```shell
-   pip install "open-dataflow[mineru]"
-   ```  
-   或者从源码安装：
-   ```shell
-   pip install -e ".[mineru]"
-   ```
-
-2. 下载 MinerU 模型权重  
-   ```shell
-   mineru-models-download
-   ```
-
-> 本算子基于 MinerU 实现 PDF 内容切分与图像抽取，请确保安装并下载模型权重成功。
-
-## 2 配置 LLM Serving
-
-当前算子仅支持基于 API 的 VLM Serving。请在运行前设置好 API 地址和 Key。
-
-- Linux / macOS：
-  ```shell
-  export DF_API_KEY="sk-xxxxx"
-  ```
-- Windows PowerShell：
-  ```powershell
-  $env:DF_API_KEY = "sk-xxxxx"
-  ```
-
-后续在代码中会通过环境变量读取该 API Key，无需在脚本中明文填写。
-
-## 3 准备测试 PDF
-
-示例仓库自带一份测试 PDF：  
-```
-./dataflow/example/KBCleaningPipeline/questionextract_test.pdf
-```  
-你也可以替换为任意数学教材或习题集 PDF。
-
-## 4 初始化并修改脚本
-
-首先，在任意位置创建一个新的 `run_dataflow` 文件夹，并进入该目录，然后执行 Dataflow 项目初始化：
-
-```shell
-mkdir run_dataflow
-cd run_dataflow
-dataflow init
-```
-
-初始化完成后，项目目录下会出现以下文件：
-
-```shell
-run_dataflow/playground/mathbook_extract.py
-```
-
-该脚本的内容如下：
-
-```python
-from dataflow.operators.generate import MathBookQuestionExtract
-from dataflow.serving.APIVLMServing_openai import APIVLMServing_openai
-
-class QuestionExtractPipeline:
-    def __init__(self, llm_serving: APIVLMServing_openai):
-        self.extractor = MathBookQuestionExtract(llm_serving)
-        self.test_pdf = "../example/KBCleaningPipeline/questionextract_test.pdf"
-
-    def forward(
-        self,
-        pdf_path: str,
-        output_name: str,
-        output_dir: str,
-        api_url: str = "https://api.openai.com/v1/chat/completions",
-        key_name_of_api_key: str = "DF_API_KEY",
-        model_name: str = "o4-mini",
-        max_workers: int = 20
-    ):
-        self.extractor.run(
-            pdf_file_path=pdf_path,
-            output_file_name=output_name,
-            output_folder=output_dir,
-            api_url=api_url,
-            key_name_of_api_key=key_name_of_api_key,
-            model_name=model_name,
-            max_workers=max_workers
-        )
-
-if __name__ == "__main__":
-    # 1. 初始化 LLM Serving
-    llm_serving = APIVLMServing_openai(
-        api_url="https://api.openai.com/v1/chat/completions",
-        model_name="o4-mini",      # 推荐使用强推理模型
-        max_workers=20             # 并发请求数
-    )
-
-    # 2. 构造并运行提取管道
-    pipeline = QuestionExtractPipeline(llm_serving)
-    pipeline.forward(
-        pdf_path=pipeline.test_pdf,
-        output_name="test_question_extract",
-        output_dir="./output"
-    )
-```
-
-### 关键参数说明
-
-- `api_url`：OpenAI VLM 接口地址  
-- `key_name_of_api_key`：环境变量名称  
-- `model_name`：模型名称（如 `o4-mini`，建议使用强推理模型）  
-- `max_workers`：并发请求数量
-
-
-### 算子逻辑
-
-算子的完整实现位于  
-`dataflow/operators/generate/KnowledgeCleaning/mathbook_question_extract.py`  
-下面从整体流程出发，对各关键环节做简要而不失细节的说明，便于使用和二次开发：
-
-1. PDF 文件切割  
-   - 利用 `pymupdf`（fitz）打开目标 PDF，将每一页按设定的 DPI 渲染成高质量的 JPEG 图片。  
-   - 图片按页编号保存到指定输出目录，并通过日志记录每一页的转换进度，确保可追溯性。
-
-2. 调用 MinerU 进行内容识别与图片提取  
-   - 动态导入 `mineru` 模块，若未安装则抛出友好提示，指导用户完成 `pip install mineru[pipeline]` 和模型下载。  
-   - 通过环境变量 `MINERU_MODEL_SOURCE=local` 指定从本地加载模型，支持后端选项 `"vlm-sglang-engine"` 或 `"pipeline"`。  
-   - 执行命令行工具：  
-```shell
-    mineru -p <pdf_file> -o <output_folder> -b <backend> --source local  
-```
-   - 命令执行后会在中间目录下生成 `*_content_list.json`（结构化内容清单）和原始切割出的图片文件夹。
-
-3. 整理与重命名图片资源  
-   - 读取 MinerU 产出的 `content_list.json`，筛选出所有 `type=='image'` 项。  
-   - 将对应的图片从 MinerU 的临时目录复制到最终结果文件夹，并按序重命名为 `0.jpg, 1.jpg...`。  
-   - 同时生成一份新的 JSON 清单，记录每张图片在源 PDF 中的页码及新文件路径。
-
-4. 组织模型调用指令  
-   - 从 `dataflow.prompts.kbcleaning.KnowledgeCleanerPrompt` 中获取预定义的文本提示（`mathbook_question_extract_prompt`），明确任务要求和格式规范。  
-   - 将渲染好的指令与多张输入图片（页图、插图）打包，为后续并发调用 LLM 服务做准备。
-
-5. 并发获取模型响应  
-   - 使用 `APIVLMServing_openai`（或其他 `LLMServingABC` 实现）并结合 `ThreadPoolExecutor`，将打包好的图片列表与标签并发提交给模型。  
-   - 可自定义模型名称、API 地址、并发数和超时时间，灵活满足不同性能与成本需求。
-
-6. 解析并保存最终输出  
-   - 在 `analyze_and_save` 方法中，通过正则表达式精准抓取模型返回文本内的 `<image>index.jpg</image>` 标签。  
-   - 将标签中引用的对应图片复制到结果目录的 `images/` 子文件夹。  
-   - 以两种格式输出结果：  
-     a. JSON 文件：按顺序保存各题的纯文本（已剔除标签）和对应图片路径列表  
-     b. Markdown 文件：原文中以 `![](images/xx.jpg)` 形式嵌入图片，易于可视化查看  
-   - 输出文件统一保存在用户指定的结果文件夹下，便于后续校验和二次使用。
-
-
-
-## 5 运行脚本
-
-```shell
-python generate_question_extract_api.py
-```
-
-执行完成后，`./output` 目录下将产生：
-
-- `test_question_extract.json`  
-  每条记录包含：
-  - `text`：提取到的题目文本  
-  - `pics`：题目涉及的图片路径列表  
-- `test_question_extract.md`  
-  以 Markdown 形式展示题目与配图
-
-## 6 可选扩展
-
-- 自定义提示词：若需调整系统提示，可在算子内部替换：
-  ```python
-  from dataflow.prompts.kbcleaning import KnowledgeCleanerPrompt
-  system_prompt = KnowledgeCleanerPrompt().mathbook_question_extract_prompt()
-  ```
-- 参数定制：支持切换 MinerU 后端（`pipeline` | `vlm-sglang-engine`）、调整 DPI、并发数等，详见算子 `run` 方法签名。
\ No newline at end of file

From 9edd0255b08ed55ad24714440851231d9b6c3f27 Mon Sep 17 00:00:00 2001
From: wongzhenhao <zhenhao1141@gmail.com>
Date: Mon, 2 Feb 2026 16:57:08 +0800
Subject: [PATCH 2/3] move PDFVQAExtract pipeline doc to quickstart

---
 docs/.vuepress/notes/en/guide.ts                               | 3 +--
 docs/.vuepress/notes/zh/guide.ts                               | 3 +--
 .../PDFVQAExtractPipeline.md => quickstart/PDFVQAExtract.md}   | 2 +-
 .../PDFVQAExtractPipeline.md => quickstart/PDFVQAExtract.md}   | 2 +-
 4 files changed, 4 insertions(+), 6 deletions(-)
 rename docs/en/notes/guide/{pipelines/PDFVQAExtractPipeline.md => quickstart/PDFVQAExtract.md} (99%)
 rename docs/zh/notes/guide/{pipelines/PDFVQAExtractPipeline.md => quickstart/PDFVQAExtract.md} (99%)

diff --git a/docs/.vuepress/notes/en/guide.ts b/docs/.vuepress/notes/en/guide.ts
index 6ea317d79..8dc26d906 100644
--- a/docs/.vuepress/notes/en/guide.ts
+++ b/docs/.vuepress/notes/en/guide.ts
@@ -42,7 +42,7 @@ export const Guide: ThemeNote = defineNoteConfig({
                 'conversation_synthesis',
                 'reasoning_general',
                 'prompted_vqa',
-                'mathquestion_extract',
+                'PDFVQAExtract',
                 'knowledge_cleaning',
                 'speech_transcription',
             ],
@@ -83,7 +83,6 @@ export const Guide: ThemeNote = defineNoteConfig({
                 "KnowledgeBaseCleaningPipeline",
                 "FuncCallPipeline",
                 "Pdf2ModelPipeline",
-                "PDFVQAExtractPipeline",
             ]
         },
         {
diff --git a/docs/.vuepress/notes/zh/guide.ts b/docs/.vuepress/notes/zh/guide.ts
index f784356f8..942a615d0 100644
--- a/docs/.vuepress/notes/zh/guide.ts
+++ b/docs/.vuepress/notes/zh/guide.ts
@@ -50,7 +50,7 @@ export const Guide: ThemeNote = defineNoteConfig({
                 'conversation_synthesis',
                 "reasoning_general",
                 "prompted_vqa",
-                "mathquestion_extract",
+                "PDFVQAExtract",
                 'knowledge_cleaning',
                 'speech_transcription',
             ],
@@ -81,7 +81,6 @@ export const Guide: ThemeNote = defineNoteConfig({
                 "KnowledgeBaseCleaningPipeline",
                 "FuncCallPipeline",
                 "Pdf2ModelPipeline",
-                "PDFVQAExtractPipeline",
             ]
         },
         {
diff --git a/docs/en/notes/guide/pipelines/PDFVQAExtractPipeline.md b/docs/en/notes/guide/quickstart/PDFVQAExtract.md
similarity index 99%
rename from docs/en/notes/guide/pipelines/PDFVQAExtractPipeline.md
rename to docs/en/notes/guide/quickstart/PDFVQAExtract.md
index b6c4a1add..cedfea0f7 100644
--- a/docs/en/notes/guide/pipelines/PDFVQAExtractPipeline.md
+++ b/docs/en/notes/guide/quickstart/PDFVQAExtract.md
@@ -1,5 +1,5 @@
 ---
-title: PDF VQA Extraction Pipeline
+title: Case 6. PDF VQA Extraction Pipeline
 createTime: 2025/11/17 14:01:55
 permalink: /en/guide/vqa_extract_optimized/
 icon: heroicons:document-text
diff --git a/docs/zh/notes/guide/pipelines/PDFVQAExtractPipeline.md b/docs/zh/notes/guide/quickstart/PDFVQAExtract.md
similarity index 99%
rename from docs/zh/notes/guide/pipelines/PDFVQAExtractPipeline.md
rename to docs/zh/notes/guide/quickstart/PDFVQAExtract.md
index 774b4207b..d9d8ade80 100644
--- a/docs/zh/notes/guide/pipelines/PDFVQAExtractPipeline.md
+++ b/docs/zh/notes/guide/quickstart/PDFVQAExtract.md
@@ -1,5 +1,5 @@
 ---
-title: PDF中的VQA提取流水线
+title: 案例6. PDF中的VQA提取流水线
 createTime: 2025/11/17 14:01:55
 permalink: /zh/guide/vqa_extract_optimized/
 icon: heroicons:document-text

From 00f886df1b647d2c3418bb2c8b80741ba0ff784d Mon Sep 17 00:00:00 2001
From: wongzhenhao <zhenhao1141@gmail.com>
Date: Mon, 2 Feb 2026 16:57:25 +0800
Subject: [PATCH 3/3] renumber quickstart cases

---
 docs/en/notes/guide/quickstart/speech_transcription.md | 2 +-
 docs/zh/notes/guide/quickstart/speech_transcription.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/en/notes/guide/quickstart/speech_transcription.md b/docs/en/notes/guide/quickstart/speech_transcription.md
index 2a584e96a..8d009486f 100644
--- a/docs/en/notes/guide/quickstart/speech_transcription.md
+++ b/docs/en/notes/guide/quickstart/speech_transcription.md
@@ -1,5 +1,5 @@
 ---
-title: Case 9. Speech transcription
+title: Case 8. Speech transcription
 createTime: 2025/08/22 16:38:49
 permalink: /en/guide/5pdipkiv/
 icon: fad:headphones
diff --git a/docs/zh/notes/guide/quickstart/speech_transcription.md b/docs/zh/notes/guide/quickstart/speech_transcription.md
index 84a093869..0fcc481b6 100644
--- a/docs/zh/notes/guide/quickstart/speech_transcription.md
+++ b/docs/zh/notes/guide/quickstart/speech_transcription.md
@@ -1,5 +1,5 @@
 ---
-title: 案例9. 语音转文字
+title: 案例8. 语音转文字
 createTime: 2025/08/22 16:37:30
 permalink: /zh/guide/du2akut8/
 icon: fad:headphones