[Feature Request] Integrate `microsoft/markitdown` as lightweight document reader with settings toggle for built-in tools

## Feature Request: Integrate MarkItDown as a lightweight document reading option for built-in tools

### Problem
Currently, GoClaw agents read documents (PDF, DOCX, XLSX, PPTX, etc.) using custom Python scripts in each skill's `scripts/` directory. While this works, it has several drawbacks:

- **High token consumption**: Each skill's extraction script may produce verbose output, consuming more LLM tokens than necessary
- **Duplicated effort**: Multiple skills (docx, pdf, xlsx, pptx) each maintain their own extraction logic
- **Maintenance burden**: Each skill script needs its own dependency management and updates
- **No unified toggle**: Users can't easily switch between extraction methods or disable document reading to save tokens

### Proposed Solution
Integrate [Microsoft MarkItDown](https://github.com/microsoft/markitdown) as an optional, unified document-to-Markdown conversion engine for GoClaw's built-in tools.

### What is MarkItDown?
- **Lightweight Python utility** from Microsoft's AutoGen team for converting files to Markdown
- **Supports**: PDF, PowerPoint, Word, Excel, Images (EXIF + OCR), Audio (metadata + transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPUBs, and more
- **Token-efficient output**: Converts to clean Markdown that LLMs understand natively
- **MCP server available**: Already has a `markitdown-mcp` package for LLM integration
- **Plugin system**: Supports 3rd-party plugins (e.g., `markitdown-ocr` for image OCR in documents)
- **MIT License**: Permissive, suitable for integration
- **CLI + Python API**: `markitdown file.pdf > output.md` or `MarkItDown().convert("file.pdf")`
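
As a rough illustration of the Python API mentioned above, the following minimal sketch wraps `MarkItDown().convert(...)` in a helper. The function name `convert_to_markdown` is hypothetical and not part of MarkItDown or GoClaw; the `enable_plugins` argument and the `text_content` attribute are taken from the MarkItDown README and should be checked against the installed version.

```python
def convert_to_markdown(path: str, enable_plugins: bool = False) -> str:
    """Convert one document to Markdown text (hypothetical helper)."""
    # Imported lazily so the caller can still load when markitdown is absent.
    from markitdown import MarkItDown

    converter = MarkItDown(enable_plugins=enable_plugins)
    result = converter.convert(path)
    # The conversion result exposes the Markdown as `text_content`.
    return result.text_content
```

A caller would use it as `markdown = convert_to_markdown("file.pdf")`, which should be roughly equivalent to `markitdown file.pdf` on the command line.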

### Integration Design

#### 1. Toggle in Settings
Add a settings toggle to enable/disable MarkItDown as the document reader:

```json
{
  "builtInTools": {
    "documentReader": {
      "engine": "markitdown",
      "enabled": true,
      "markitdownOptions": {
        "enablePlugins": false,
        "featureGroups": ["pdf", "docx", "xlsx", "pptx"],
        "llmClient": null,
        "llmModel": null
      }
    }
  }
}
```

`engine` accepts one of three values:

- `"markitdown"`: Use MarkItDown for all supported file types
- `"skill-scripts"`: Use existing per-skill Python scripts (current behavior)
- `"auto"`: Use MarkItDown if available, fall back to skill scripts

`enabled: false` turns document reading off entirely. `llmClient` and `llmModel` are optional and only needed for LLM-based image descriptions / OCR.
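
To make the shape of this setting concrete, here is a minimal sketch of how it could be parsed and validated. The class `DocumentReaderSettings`, the function `parse_settings`, and the choice of `"skill-scripts"` as the default engine (because it matches current behavior) are assumptions for illustration, not existing GoClaw code.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

VALID_ENGINES = ("markitdown", "skill-scripts", "auto")


@dataclass
class DocumentReaderSettings:
    # Defaults are assumptions: keep today's behavior unless configured.
    engine: str = "skill-scripts"
    enabled: bool = True
    enable_plugins: bool = False
    feature_groups: List[str] = field(default_factory=list)
    llm_model: Optional[str] = None


def parse_settings(raw: Dict[str, Any]) -> DocumentReaderSettings:
    """Read builtInTools.documentReader from a decoded settings dict."""
    reader = raw.get("builtInTools", {}).get("documentReader", {})
    options = reader.get("markitdownOptions", {})

    engine = reader.get("engine", "skill-scripts")
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown document reader engine: {engine!r}")

    return DocumentReaderSettings(
        engine=engine,
        enabled=bool(reader.get("enabled", True)),
        enable_plugins=bool(options.get("enablePlugins", False)),
        feature_groups=list(options.get("featureGroups", [])),
        llm_model=options.get("llmModel"),
    )
```

Rejecting unknown `engine` values early keeps a typo in the settings file from silently selecting the wrong reader.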

#### 2. System Dependency
Add `markitdown` as a recognized Python package in the dependency installer:

```
pip:markitdown[all]  # or selective: pip:markitdown[pdf,docx,xlsx,pptx]
```
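
The selective form could be derived from the `featureGroups` setting. The sketch below assumes that each feature group maps one-to-one onto a MarkItDown pip extra and that an empty list means `[all]`; the function name `pip_requirement` is hypothetical.

```python
import re
from typing import Iterable, List

# Conservative pattern for extra names such as "pdf" or "az-doc-intel".
_EXTRA_NAME = re.compile(r"^[a-z0-9][a-z0-9-]*$")


def pip_requirement(feature_groups: Iterable[str] = ()) -> str:
    """Build a requirement string such as 'markitdown[pdf,docx]'."""
    extras: List[str] = []
    for name in feature_groups:
        if not _EXTRA_NAME.match(name):
            raise ValueError(f"invalid extra name: {name!r}")
        if name not in extras:  # drop duplicates, keep the given order
            extras.append(name)
    if not extras:
        extras = ["all"]  # assumption: no selection means install everything
    return f"markitdown[{','.join(extras)}]"
```

The result can be prefixed with `pip:` to match the dependency installer notation shown above.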

#### 3. Fallback Chain
```
User uploads document
  → MarkItDown enabled?
      → Yes → Convert to Markdown → Return
      → No  → Fall back to skill scripts (docx/pdf/xlsx/pptx skills)
                → Skill scripts succeed? → Return
                → Skill scripts fail?    → Return error
```
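
The chain above can be sketched as a small dispatcher. Everything here is illustrative: `read_document`, `DocumentReadError`, and the two injected converter callables are hypothetical names, and the handling of a failed MarkItDown conversion (fall back in `"auto"` mode, report an error in `"markitdown"` mode) is an assumption that the issue does not spell out.

```python
from typing import Callable, Optional

Converter = Callable[[str], str]  # takes a file path, returns Markdown/text


class DocumentReadError(Exception):
    """Raised when no configured reader can extract the document."""


def read_document(
    path: str,
    engine: str,
    markitdown_convert: Optional[Converter],
    skill_convert: Converter,
) -> str:
    wants_markitdown = engine in ("markitdown", "auto")
    if wants_markitdown and markitdown_convert is not None:
        try:
            return markitdown_convert(path)
        except Exception as exc:
            if engine == "markitdown":
                raise DocumentReadError(f"MarkItDown failed for {path}") from exc
            # engine == "auto": fall through to the skill scripts
    elif engine == "markitdown":
        raise DocumentReadError("MarkItDown is selected but not available")

    try:
        return skill_convert(path)
    except Exception as exc:
        raise DocumentReadError(f"skill scripts failed for {path}") from exc
```

Passing the converters in as callables keeps the dispatcher independent of how MarkItDown or the skill scripts are actually invoked.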

### Benefits

| Aspect | Current (Skill Scripts) | With MarkItDown |
|--------|------------------------|-----------------|
| Token usage | Varies by skill, often verbose | Optimized Markdown, token-efficient |
| Maintenance | Per-skill scripts to maintain | Single unified library |
| Format support | Limited to what skills implement | 12+ formats out of the box |
| Toggle | No global toggle | Settings toggle on/off |
| LLM image desc | Not supported | Built-in with llm_client |
| Plugin extensibility | Custom per skill | Standard plugin system |

### Use Cases
- **Token savings**: Users on tight context windows can use MarkItDown's leaner output
- **Quick document preview**: Convert any supported file to Markdown without loading multiple skills
- **Unified pipeline**: One tool handles PDF, DOCX, XLSX, PPTX, images, audio, etc.
- **Disable when not needed**: Toggle off to skip document reading entirely and save processing time

### Implementation Notes
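- MarkItDown's converters read from **file streams** (binary file-like objects) rather than file paths, so no temporary files are created; `convert()` still accepts a path and opens the stream itself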
- Can be installed selectively: `pip install 'markitdown[pdf,docx]'` instead of `[all]`
- An MCP server already exists (`markitdown-mcp`); it could be used directly or as a reference
- Plugin system supports OCR via `markitdown-ocr` for extracting text from images in documents
- Docker support available for sandboxed execution
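
For uploads that are already in memory, stream-based conversion might look like the sketch below. It assumes the MarkItDown 0.1.x API, where `convert_stream()` takes a binary file-like object and an optional `StreamInfo` hint; the helper name `convert_bytes` is hypothetical, and the exact signature should be verified against the installed version.

```python
import io


def convert_bytes(data: bytes, extension: str) -> str:
    """Convert an in-memory document to Markdown (hypothetical helper)."""
    # Imported lazily so the caller can still load when markitdown is absent.
    from markitdown import MarkItDown, StreamInfo

    stream = io.BytesIO(data)  # binary stream, no temporary file on disk
    hint = StreamInfo(extension=extension)  # e.g. ".pdf" or ".docx"
    result = MarkItDown(enable_plugins=False).convert_stream(stream, stream_info=hint)
    return result.text_content
```

Supplying the extension as a hint is optional in principle, but it saves MarkItDown from having to guess the format from the bytes alone.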

### Comparison with Existing Approach
We already have a [comparison report](link-to-existing-report) between LiteParse and MarkItDown, in which the two scored similarly (8/10 vs 8.5/10). This issue is specifically about **using MarkItDown as a built-in tool option** with a settings toggle, not replacing existing skill scripts entirely.

---

**Labels:** enhancement, document-processing, token-optimization, built-in-tools

