docs: README add SAAS apply (MigoXLab#345)

e06084 · actions-user · web-flow · commit 3bcc9fa444a1 · 2026-02-13T14:35:16.000+08:00
* docs: README add SAAS apply

* x

* 📚 Auto-update metrics documentation

---------

Co-authored-by: GitHub Action <action@github.com>
diff --git a/README.md b/README.md
@@ -59,6 +59,27 @@
 
 **Dingo is A Comprehensive AI Data, Model and Application Quality Evaluation Tool**, designed for ML practitioners, data engineers, and AI researchers. It helps you systematically assess and improve the quality of training data, fine-tuning datasets, and production AI systems.
 
+---
+
+## 🚀 Enterprise Dingo SaaS Version
+
+Need a **production-grade data quality platform**? Try the [Dingo SaaS](https://github.com/MigoXLab/dingo-saas) Enterprise Edition!
+
+### ✨ Compared to the open-source version, SaaS provides:
+
+- 🌐 **Web UI** - Visual evaluation interface, no coding required
+- 🔐 **Access Control** - JWT + Google OAuth 2.0
+- 📊 **Visual Reports** - Interactive charts, trend analysis, export features
+- 🔌 **RESTful API** - Seamless integration with existing systems
+
+### 📝 How to Get the SaaS Code for Free
+
+👉 **[Apply for Dingo SaaS Repository Access](https://aicarrier.feishu.cn/share/base/form/shrcn9RqYttByQ5H1np6Yrnmhuf)** 
+
+Review time: 1-5 business days | Suitable for enterprise data governance and team collaboration
+
+---
+
 ## Why Dingo?
 
 🎯 **Production-Grade Quality Checks** - From pre-training datasets to RAG systems, ensure your AI gets high-quality data
diff --git a/README_ja.md b/README_ja.md
@@ -58,6 +58,27 @@
 
 **Dingo は包括的な AI データ、モデル、アプリケーション品質評価ツール**であり、機械学習エンジニア、データエンジニア、AI 研究者向けに設計されています。トレーニングデータ、ファインチューニングデータセット、本番 AI システムの品質を体系的に評価・改善するのを支援します。
 
+---
+
+## 🚀 エンタープライズ SaaS 版
+
+**本番グレードのデータ品質プラットフォーム**が必要ですか？[Dingo SaaS](https://github.com/MigoXLab/dingo-saas) エンタープライズ版をお試しください！
+
+### ✨ オープンソース版と比較して、SaaS 版が提供する機能：
+
+- 🌐 **Web UI** - ビジュアル評価インターフェース、コーディング不要
+- 🔐 **アクセス制御** - JWT + Google OAuth 2.0
+- 📊 **ビジュアルレポート** - インタラクティブなチャート、トレンド分析、エクスポート機能
+- 🔌 **RESTful API** - 既存システムとのシームレスな統合
+
+### 📝 無料 SaaS コードの入手方法
+
+👉 **[Dingo SaaS リポジトリアクセスを申請する](https://aicarrier.feishu.cn/share/base/form/shrcn9RqYttByQ5H1np6Yrnmhuf)** 
+
+審査時間：1-5営業日 | エンタープライズデータガバナンス、チームコラボレーションに最適
+
+---
+
 ## なぜ Dingo を選ぶのか？
 
 🎯 **本番グレードの品質チェック** - 事前学習データセットから RAG システムまで、AI に高品質なデータを提供
diff --git a/README_zh-CN.md b/README_zh-CN.md
@@ -58,6 +58,27 @@
 
 **Dingo 是一款全面的 AI 数据、模型和应用质量评估工具**，专为机器学习工程师、数据工程师和 AI 研究人员设计。它帮助你系统化地评估和改进训练数据、微调数据集和生产AI系统的质量。
 
+---
+
+## 🚀 企业级 Dingo SaaS 版本
+
+需要 **生产级数据质量平台** 吗？试试 [Dingo SaaS](https://github.com/MigoXLab/dingo-saas) 企业版！
+
+### ✨ 相比开源版，SaaS 版提供：
+
+- 🌐 **Web UI** - 可视化评估界面，无需写代码
+- 🔐 **权限管理** - JWT + Google OAuth 2.0
+- 📊 **可视化报告** - 交互式图表、趋势分析、导出功能
+- 🔌 **RESTful API** - 与现有系统无缝集成
+
+### 📝 如何获得免费 SaaS 代码
+
+👉 **[点击申请 Dingo SaaS 代码仓库访问权限](https://aicarrier.feishu.cn/share/base/form/shrcn9RqYttByQ5H1np6Yrnmhuf)** 
+
+审核时间：1-5 个工作日 | 适合企业数据治理、团队协作
+
+---
+
 ## 为什么选择 Dingo?
 
 🎯 **生产级质量检查** - 从预训练数据集到 RAG 系统，确保你的 AI 获得高质量数据
diff --git a/dingo/model/llm/llm_scout.py b/dingo/model/llm/llm_scout.py
@@ -281,7 +281,7 @@ def _clean_response(response: str) -> str:
         start = response.find('{')
         end = response.rfind('}')
         if start != -1 and end != -1:
-            response = response[start:end+1]
+            response = response[start:end + 1]
 
         return response.strip()
 
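Note on the hunk above: the change to `_clean_response` is whitespace-only (`end+1` becomes `end + 1`), so behavior should be unchanged. For reference, the visible lines can be reproduced as a standalone sketch. The function name `clean_response_sketch` is hypothetical, and any logic that precedes these lines in the real function is not shown in the diff and is not modeled here.

```python
import json


def clean_response_sketch(response: str) -> str:
    """Hypothetical stand-in for the visible tail of `_clean_response`.

    Keeps only the span from the first '{' to the last '}' (inclusive),
    then strips surrounding whitespace.
    """
    start = response.find('{')
    end = response.rfind('}')
    # Slice only when both braces were found; otherwise keep the text as-is.
    if start != -1 and end != -1:
        response = response[start:end + 1]

    return response.strip()


# Example: chatter around a JSON object is trimmed away before parsing.
raw = 'Sure, here is the result: {"score": 1, "detail": {"ok": true}} Hope this helps.'
parsed = json.loads(clean_response_sketch(raw))
```

One property worth keeping in mind when reviewing: if a `}` appears before the first `{`, the slice is empty and the sketch returns an empty string, which mirrors what the visible lines do.
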
diff --git a/docs/metrics.md b/docs/metrics.md
@@ -51,14 +51,13 @@ This document provides comprehensive information about all quality metrics used
 |------|--------|-------------|--------------|-------------------|----------|
 | `LLMClassifyQR` | LLMClassifyQR | Identifies images as CAPTCHA, QR code, or normal images | Internal Implementation | N/A | N/A |
 | `VLMOCRUnderstanding` | VLMOCRUnderstanding | 评估多模态模型对图片中文字内容的识别和理解能力，使用DeepSeek-OCR作为Ground Truth | [DeepSeek-OCR: Contexts Optical Compression](https://github.com/deepseek-ai/DeepSeek-OCR) | [📊 See Results](通过对比VLM输出与OCR ground truth，识别文字遗漏、错误、幻觉等问题) | N/A |
-| `VLMRenderJudge` | VLMRenderJudge | VLM-based OCR quality evaluation through visual render-compare | Internal Implementation | N/A | N/A |
 
 ### Rule-Based TEXT Quality Metrics
 
 | Type | Metric | Description | Paper Source | Evaluation Results | Examples |
 |------|--------|-------------|--------------|-------------------|----------|
 | `QUALITY_BAD_COMPLETENESS` | RuleLineEndWithEllipsis, RuleLineEndWithTerminal, RuleSentenceNumber, RuleWordNumber | Checks whether the ratio of lines ending with ellipsis is below threshold; Checks whether the ratio of lines ending w... | [RedPajama: an Open Dataset for Training Large Language Models](https://github.com/togethercomputer/RedPajama-Data) (Together Computer, 2023) | [📊 See Results](eval/rule/slimpajama_data_evaluated_by_rule.md) | N/A |
-| `QUALITY_BAD_EFFECTIVENESS` | RuleDoi, RuleIsbn, RuleAbnormalChar, RuleAbnormalHtml, RuleAlphaWords, RuleAudioDataFormat, RuleCharNumber, RuleColonEnd, RuleContentNull, RuleContentShort, RuleContentShortMultiLan, RuleEnterAndSpace, RuleEnterMore, RuleEnterRatioMore, RuleHtmlEntity, RuleHtmlTag, RuleInvisibleChar, RuleImageDataFormat, RuleLatexSpecialChar, RuleLineJavascriptCount, RuleLoremIpsum, RuleMeanWordLength, RuleNlpDataFormat, RuleSftDataFormat, RuleSpaceMore, RuleSpecialCharacter, RuleStopWord, RuleSymbolWordRatio, RuleVedioDataFormat, RuleOnlyUrl | Check whether the string is in the correct format of the doi; Check whether the string is in the correct format of th... | Internal Implementation | N/A | N/A |
+| `QUALITY_BAD_EFFECTIVENESS` | RuleAbnormalChar, RuleAbnormalHtml, RuleAlphaWords, RuleAudioDataFormat, RuleCharNumber, RuleColonEnd, RuleContentNull, RuleContentShort, RuleContentShortMultiLan, RuleEnterAndSpace, RuleEnterMore, RuleEnterRatioMore, RuleHtmlEntity, RuleHtmlTag, RuleInvisibleChar, RuleImageDataFormat, RuleLatexSpecialChar, RuleLineJavascriptCount, RuleLoremIpsum, RuleMeanWordLength, RuleNlpDataFormat, RuleSftDataFormat, RuleSpaceMore, RuleSpecialCharacter, RuleStopWord, RuleSymbolWordRatio, RuleVedioDataFormat, RuleOnlyUrl, RuleDoi, RuleIsbn | Detects garbled text and anti-crawling characters by combining special character and invisible character detection; D... | [RedPajama: an Open Dataset for Training Large Language Models](https://github.com/togethercomputer/RedPajama-Data) (Together Computer, 2023) | [📊 See Results](eval/rule/slimpajama_data_evaluated_by_rule.md) | N/A |
 | `QUALITY_BAD_FLUENCY` | RuleAbnormalNumber, RuleCharSplit, RuleNoPunc, RuleWordSplit, RuleWordStuck | Checks PDF content for abnormal book page or index numbers that disrupt text flow; Checks PDF content for abnormal ch... | [RedPajama: an Open Dataset for Training Large Language Models](https://github.com/togethercomputer/RedPajama-Data) (Together Computer, 2023) | [📊 See Results](eval/rule/slimpajama_data_evaluated_by_rule.md) | N/A |
 | `QUALITY_BAD_RELEVANCE` | RuleHeadWordAr, RuleHeadWordCs, RuleHeadWordHu, RuleHeadWordKo, RuleHeadWordRu, RuleHeadWordSr, RuleHeadWordTh, RuleHeadWordVi, RulePatternSearch, RuleWatermark | Checks whether Arabic content contains irrelevant tail source information; Checks whether Czech content contains irre... | [RedPajama: an Open Dataset for Training Large Language Models](https://github.com/togethercomputer/RedPajama-Data) (Together Computer, 2023) | [📊 See Results](eval/rule/slimpajama_data_evaluated_by_rule.md) | N/A |
 | `QUALITY_BAD_SECURITY` | RuleIDCard, RuleUnsafeWords, RulePIIDetection | Checks whether content contains ID card information; Checks whether content contains unsafe words; Detects Personal I... | [RedPajama: an Open Dataset for Training Large Language Models](https://github.com/togethercomputer/RedPajama-Data) (Together Computer, 2023) | [📊 See Results](eval/rule/slimpajama_data_evaluated_by_rule.md) | N/A |
@@ -83,6 +82,12 @@ This document provides comprehensive information about all quality metrics used
 | `QUALITY_BAD_EFFECTIVENESS` | RuleAudioDuration | Check whether the audio duration meets the standard | Internal Implementation | N/A | N/A |
 | `QUALITY_BAD_EFFECTIVENESS` | RuleAudioSnrQuality | Check whether the audio signal-to-noise ratio meets the standard | Internal Implementation | N/A | N/A |
 
+### Job Hunting Strategy Metrics
+
+| Type | Metric | Description | Paper Source | Evaluation Results | Examples |
+|------|--------|-------------|--------------|-------------------|----------|
+| `LLMScout` | LLMScout | Strategic job hunting analysis with industry report parsing and person-job matching | Internal Implementation | N/A | N/A |
+
 ### Meta Rater Evaluation Metrics
 
 | Type | Metric | Description | Paper Source | Evaluation Results | Examples |
@@ -107,6 +112,12 @@ This document provides comprehensive information about all quality metrics used
 | `LLMResumeOptimizer` | LLMResumeOptimizer | ATS-focused resume optimization with keyword injection and STAR polishing | Internal Implementation | N/A | N/A |
 | `LLMResumeQuality` | LLMResumeQuality | Comprehensive resume quality evaluation covering privacy, contact, format, structure, professionalism, date, and comp... | Internal Implementation | N/A | N/A |
 
+### Rule-Based Metadata Quality Metrics
+
+| Type | Metric | Description | Paper Source | Evaluation Results | Examples |
+|------|--------|-------------|--------------|-------------------|----------|
+| `QUALITY_BAD_EFFECTIVENESS` | RuleMetadataSimilarity | Checks how closely metadata fields match the reference data by similarity; the default threshold is 0.6 | Internal Implementation | N/A | N/A |
+
 ### Rule-Based RESUME Quality Metrics
 
 | Type | Metric | Description | Paper Source | Evaluation Results | Examples |
@@ -131,3 +142,9 @@ This document provides comprehensive information about all quality metrics used
 |------|--------|-------------|--------------|-------------------|----------|
 | `LLMLongVideoQa` | LLMLongVideoQa | Generate video-related question-answer pairs based on the summarized information of the input long video. | [VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos](https://arxiv.org/abs/2506.108572) (Jiashuo Yu et al., 2025) | N/A | N/A |
 
+### Other Metrics
+
+| Type | Metric | Description | Paper Source | Evaluation Results | Examples |
+|------|--------|-------------|--------------|-------------------|----------|
+| `AgentFactCheck` | AgentFactCheck | Agent-based hallucination detection with autonomous web search | Internal Implementation | N/A | N/A |
+