feat: refactor CLI to Agent-Native (MigoXLab#366)

e06084 · web-flow · commit 070a23e372f0 · 2026-03-23T13:51:21.000+08:00
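
The CLI Reference added to AGENTS.md in this change says that `dingo --input config.json` (no subcommand) behaves the same as `dingo eval --input config.json`. The `dingo/run/cli.py` hunk is not shown here, so the following is only a minimal sketch of how such a fallback could be wired with `argparse`. The names `SUBCOMMANDS`, `normalize_argv`, `build_parser`, and `parse` are hypothetical and not taken from the actual implementation; only the flags listed in the CLI Reference are used.

```python
import argparse

# Subcommands named in the CLI Reference; everything else here is illustrative.
SUBCOMMANDS = {"eval", "info"}


def normalize_argv(argv):
    """Prepend 'eval' when no subcommand is given (legacy `dingo --input ...`)."""
    if argv and (argv[0] in SUBCOMMANDS or argv[0] in ("-h", "--help")):
        return list(argv)
    return ["eval", *argv]


def build_parser():
    """Build a parser exposing only the flags documented in the CLI Reference."""
    parser = argparse.ArgumentParser(prog="dingo")
    sub = parser.add_subparsers(dest="command", required=True)

    eval_p = sub.add_parser("eval", help="run evaluation")
    eval_p.add_argument("--input", required=True, help="path to JSON config")
    eval_p.add_argument("--json", action="store_true", help="JSON output")

    info_p = sub.add_parser("info", help="list evaluators and groups")
    info_p.add_argument("--rules", action="store_true")
    info_p.add_argument("--llm", action="store_true")
    info_p.add_argument("--groups", action="store_true")
    info_p.add_argument("--json", action="store_true")
    return parser


def parse(argv):
    """Parse argv after applying the backward-compatibility rewrite."""
    return build_parser().parse_args(normalize_argv(argv))
```

Rewriting `argv` before parsing keeps the legacy form working without registering `--input` twice; whether the real CLI takes this approach is an assumption.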
diff --git a/.cursor/rules/project-architecture.mdc b/.cursor/rules/project-architecture.mdc
@@ -16,5 +16,5 @@ alwaysApply: true
 4. **Evaluator contract**: Every evaluator `eval()` method must return `EvalDetail` with `metric`, `status` (bool), `label` (list), and `reason` (list).
 5. **Dependencies**: Core deps in `requirements/runtime.txt`, optional in `requirements/datasource.txt`. New optional deps → add to `setup.py` extras_require.
 6. **MCP Server**: Entry point is `mcp_server.py`, uses FastMCP + SSE. Environment variables for LLM config (`OPENAI_API_KEY`, etc.).
-7. **Three interfaces**: SDK (Python API), CLI (`python -m dingo.run.cli`), MCP Server. All share the same `InputArgs` → `Executor` pipeline.
+7. **Three interfaces**: SDK (Python API), CLI (`dingo eval`), MCP Server. All share the same `InputArgs` → `Executor` pipeline.
 8. **Testing**: `pytest test/scripts --ignore=test/scripts/data`. Integration tests via CLI with configs in `.github/env/`.
diff --git a/.github/workflows/IntegrationTest.yml b/.github/workflows/IntegrationTest.yml
@@ -36,32 +36,32 @@ jobs:
 
     - name: Integration Test(local plaintext)
       run: |
-        python -m dingo.run.cli --input .github/env/local_plaintext.json
-        python -m dingo.run.cli --input .github/env/local_plaintext_save.json
+        dingo eval --input .github/env/local_plaintext.json
+        dingo eval --input .github/env/local_plaintext_save.json
     - name: Integration Test(local json)
       run: |
-        python -m dingo.run.cli --input .github/env/local_json.json
+        dingo eval --input .github/env/local_json.json
     - name: Integration Test(local jsonl)
       run: |
-        python -m dingo.run.cli --input .github/env/local_jsonl.json
+        dingo eval --input .github/env/local_jsonl.json
     - name: Integration Test(local listjson)
       run: |
-        python -m dingo.run.cli --input .github/env/local_listjson.json
+        dingo eval --input .github/env/local_listjson.json
     - name: Integration Test(huggingface plaintext)
       run: |
-        python -m dingo.run.cli --input .github/env/hf_plaintext.json
+        dingo eval --input .github/env/hf_plaintext.json
     - name: Integration Test(huggingface json)
       run: |
-        python -m dingo.run.cli --input .github/env/hf_json.json
+        dingo eval --input .github/env/hf_json.json
     - name: Integration Test(huggingface jsonl)
       run: |
-        python -m dingo.run.cli --input .github/env/hf_jsonl.json
+        dingo eval --input .github/env/hf_jsonl.json
     - name: Integration Test(huggingface listjson)
       run: |
-        python -m dingo.run.cli --input .github/env/hf_listjson.json
+        dingo eval --input .github/env/hf_listjson.json
     - name: Integration Test(custom config)
       run: |
-        python -m dingo.run.cli --input .github/env/custom_config_rule.json
+        dingo eval --input .github/env/custom_config_rule.json
     - name: Run unit tests
       run: |
         pytest test/scripts --ignore=test/scripts/data
diff --git a/AGENTS.md b/AGENTS.md
@@ -34,19 +34,20 @@ dingo/
 │   ├── optional.txt         ← heavy optional deps (torch, pyspark, etc.)
 │   └── agent.txt            ← agent evaluation deps (langchain, tavily)
 │
+├── SKILL.md                 ← AI agent skill definition (symlink → clawhub/SKILL.md)
 ├── dingo/                   ← core Python package
 │   ├── config/
 │   │   └── input_args.py    ← InputArgs, EvalPiplineConfig, EvaluatorGroupConfig
 │   ├── io/
 │   │   ├── input/data.py    ← Data model (Pydantic, extra="allow")
-│   │   └── output/          ← ResultInfo, EvalDetail, SummaryModel
+│   │   └── output/          ← ResultInfo, EvalDetail, SummaryModel (+ cross-layer analysis)
 │   ├── data/
 │   │   ├── datasource/      ← LocalDataSource, SQLDataSource, S3DataSource, HFDataSource
 │   │   ├── dataset/         ← Dataset implementations per source
 │   │   └── converter/       ← Format converters (JSON, JSONL, CSV, Parquet, etc.)
 │   ├── model/
 │   │   ├── model.py         ← Model registry (rule_register, llm_register)
-│   │   ├── rule/            ← Rule-based evaluators (30+ built-in)
+│   │   ├── rule/            ← Rule-based evaluators (80+ built-in)
 │   │   │   ├── base.py      ← BaseRule
 │   │   │   ├── rule_common.py ← Common rules (text quality, format, PII, etc.)
 │   │   │   └── utils/       ← Shared utilities (normalize, ngrams, etc.)
@@ -62,10 +63,10 @@ dingo/
 │   │           ├── agent_fact_check.py
 │   │           └── agent_hallucination.py
 │   ├── exec/
-│   │   ├── local.py         ← LocalExecutor (single machine)
+│   │   ├── local.py         ← LocalExecutor (single machine, cross-layer conflict detection)
 │   │   └── spark.py         ← SparkExecutor (distributed)
 │   └── run/
-│       ├── cli.py           ← CLI entry point
+│       ├── cli.py           ← CLI entry point (subcommands: eval, info)
 │       └── vsl.py           ← GUI visualization entry point
 │
 ├── app/                     ← React frontend (Next.js-style)
@@ -191,7 +192,7 @@ pip install "dingo-python[all]"         # + Everything
 2. Create dataset class in `dingo/data/dataset/`
 3. Register in the respective `__init__.py`
 4. Use lazy imports if new dependencies required
-5. Add dependency to `requirements/datasource.txt` and `setup.py` extras
+5. Add dependency to `requirements/runtime.txt` (core) or `setup.py` extras (heavy/optional)
 
 ### Testing
 
@@ -203,9 +204,32 @@ pytest test/scripts --ignore=test/scripts/data
 pytest test/scripts/model/llm/test_rag.py -v
 
 # Integration tests (CLI)
-python -m dingo.run.cli --input test/env/local_plaintext.json
+dingo eval --input .github/env/local_plaintext.json
+dingo eval --input .github/env/local_json.json --json
 ```
 
+### CLI Reference
+
+Dingo provides a `dingo` CLI command (installed via `pip install dingo-python`):
+
+```bash
+# Run evaluation (primary command)
+dingo eval --input config.json            # Human-readable output
+dingo eval --input config.json --json     # JSON output (for agents/automation)
+
+# List available evaluators, groups
+dingo info                                # Show all (rules, LLM, groups)
+dingo info --rules                        # Rule evaluators only
+dingo info --llm                          # LLM evaluators only
+dingo info --groups                       # Rule groups only
+dingo info --json                         # JSON output
+
+# Backward compatibility (no subcommand)
+dingo --input config.json                 # Same as `dingo eval --input config.json`
+python -m dingo.run.cli --input config.json
+```
+
+
 ### Version Conventions
 
 - Version defined in `setup.py` (`version="2.0.0"`)
@@ -245,9 +269,10 @@ When these events occur, update the corresponding files:
 | Event | Update |
 |-------|--------|
 | New evaluator added | Ensure registration decorator is correct; update `docs/metrics.md` |
-| New datasource added | Update `requirements/datasource.txt`, `setup.py` extras, README install section |
-| New dependency added | Decide: `runtime.txt` (core) vs `datasource.txt` (optional); use lazy import for optional |
+| New datasource added | Update `requirements/runtime.txt`, `setup.py` extras if heavy, README install section |
+| New dependency added | Decide: `runtime.txt` (core) vs `setup.py` extras (heavy/optional); use lazy import for optional |
 | New MCP tool added | Update MCP Tools table in this file |
+| New CLI subcommand added | Update CLI Reference section in this file |
 | Directory structure change | Update this file |
 | Version bump | Update `setup.py` version field |
 
diff --git a/README.md b/README.md
@@ -188,13 +188,13 @@ if __name__ == '__main__':
 ### Evaluate with Rule Sets
 
 ```shell
-python -m dingo.run.cli --input .github/env/local_plaintext.json
+dingo eval --input .github/env/local_plaintext.json
 ```
 
 ### Evaluate with LLM (e.g., GPT-4o)
 
 ```shell
-python -m dingo.run.cli --input .github/env/local_json.json
+dingo eval --input .github/env/local_json.json
 ```
 
 ## GUI Visualization
diff --git a/README_ja.md b/README_ja.md
@@ -187,13 +187,13 @@ if __name__ == '__main__':
 ### ルールセットでの評価
 
 ```shell
-python -m dingo.run.cli --input .github/env/local_plaintext.json
+dingo eval --input .github/env/local_plaintext.json
 ```
 
 ### LLM（例：GPT-4o）での評価
 
 ```shell
-python -m dingo.run.cli --input .github/env/local_json.json
+dingo eval --input .github/env/local_json.json
 ```
 
 ## GUI可視化
diff --git a/README_zh-CN.md b/README_zh-CN.md
@@ -188,13 +188,13 @@ if __name__ == '__main__':
 ### 使用规则集评估
 
 ```shell
-python -m dingo.run.cli --input .github/env/local_plaintext.json
+dingo eval --input .github/env/local_plaintext.json
 ```
 
 ### 使用LLM评估（例如GPT-4o）
 
 ```shell
-python -m dingo.run.cli --input .github/env/local_json.json
+dingo eval --input .github/env/local_json.json
 ```
 
 ## 图形界面可视化
diff --git a/SKILL.md b/SKILL.md
@@ -0,0 +1 @@
+clawhub/SKILL.md
diff --git a/clawhub/SKILL.md b/clawhub/SKILL.md
@@ -48,7 +48,7 @@ python -c "from dingo.config import InputArgs; print('Dingo OK')"
 Dingo CLI takes a JSON config file as input:
 
 ```bash
-python -m dingo.run.cli --input config.json
+dingo eval --input config.json
 ```
 
 ### Minimal rule-based config
@@ -293,7 +293,7 @@ outputs/<timestamp>/
 When using this skill on behalf of the user:
 
 * **Always write a config file** before running CLI evaluation. Don't try to pass complex JSON inline.
-* **Quote file paths** with spaces in commands: `python -m dingo.run.cli --input "my config.json"`
+* **Quote file paths** with spaces in commands: `dingo eval --input "my config.json"`
 * **Wrap main code in `if __name__ == '__main__':`** when writing Python scripts — Dingo uses multiprocessing internally, which fails on macOS without this guard.
 * **Infer format from extension**: `.jsonl` → `jsonl`, `.json` → `json`, `.csv` → `csv`, `.txt` → `plaintext`.
 * **Default to rule-based** when the user doesn't specify evaluation type — it's free, fast, and needs no API key.
diff --git a/dingo/run/cli.py b/dingo/run/cli.py
diff --git a/docs/article_fact_checking_guide.md b/docs/article_fact_checking_guide.md
diff --git a/docs/en/CONTRIBUTING.md b/docs/en/CONTRIBUTING.md
diff --git a/setup.py b/setup.py