Commit 763e2b3

juliomsilva, kausmeows, and sannya-singal authored
[feat] Docling tool integration with tests and cookbook example (agno-agi#6277)
## Summary

Adds Docling-based document conversion tool with Markdown/Text/HTML/JSON/DocTags exports, advanced PDF/OCR options, unit tests, and cookbook

(If applicable, issue number: #____)

## Type of change

- [ ] Bug fix
- [x] New feature
- [ ] Breaking change
- [ ] Improvement
- [ ] Model update
- [ ] Other:

---

## Checklist

- [x] Code complies with style guidelines
- [x] Ran format/validation scripts (`./scripts/format.sh` and `./scripts/validate.sh`)
- [x] Self-review completed
- [x] Documentation updated (comments, docstrings)
- [x] Examples and guides: Relevant cookbook examples have been included or updated (if applicable)
- [ ] Tested in clean environment
- [x] Tests added/updated (if applicable)

---

## Additional Notes

`./scripts/test.sh` fails locally because several optional dependencies are not installed (e.g. chonkie, pypdf, ddgs, anthropic, etc.). The Docling-specific tests pass (`pytest libs/agno/tests/unit/tools/test_docling.py`).

## Screenshots

- Markdown conversion output: <img width="1706" height="662" alt="image" src="https://github.com/user-attachments/assets/30707ee2-d8b0-46b4-a6f6-6e6214e71ee9" />
- JSON conversion output: <img width="1708" height="794" alt="image" src="https://github.com/user-attachments/assets/234500ea-ab8a-4eb6-bad5-55da38ad301b" />
- DocTags conversion output: <img width="1714" height="722" alt="image" src="https://github.com/user-attachments/assets/53072422-f8eb-41ea-bfaf-a7d915dd4296" />

---------

Co-authored-by: Kaustubh <shuklakaustubh84@gmail.com>
Co-authored-by: Sannya Singal <32308435+sannya-singal@users.noreply.github.com>
1 parent 91ab3b4 commit 763e2b3

9 files changed

Lines changed: 1152 additions & 1 deletion

cookbook/91_tools/TEST_LOG.md

Lines changed: 10 additions & 0 deletions
@@ -1,5 +1,15 @@
 # Test Log

+### docling_tools/run.py
+
+**Status:** PASS
+
+**Description:** Refactored the original single-file Docling cookbook into a modular folder (`docling_tools/`) with separate files for shared paths, basic conversion examples, and OCR examples. Added PPTX and image conversion examples using static resources (`ai_presentation.pptx` and `restaurant_invoice.png`).
+
+**Result:** Syntax validation passed for `paths.py`, `basic_examples.py`, `ocr_example.py`, and `run.py` using `python -m py_compile`. Re-ran Docling unit tests with `pytest libs/agno/tests/unit/tools/test_docling.py -q` and all 24 tests passed. Full cookbook runtime execution was not performed because agent model credentials are required.
+
+---
+
 ### gitlab_tools.py

 **Status:** PASS
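
The syntax validation step described in the log entry can also be scripted. The following is a minimal sketch, not part of this commit: `syntax_check` is a hypothetical helper that wraps the standard-library `py_compile` module (the same module invoked by `python -m py_compile`) and reports one result per file.

```python
import py_compile
from typing import Dict, Iterable, Optional


def syntax_check(paths: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each path to None when it compiles, or to the error message when it does not.

    Hypothetical helper for illustration; it is not defined anywhere in the commit.
    """
    results: Dict[str, Optional[str]] = {}
    for path in paths:
        try:
            # doraise=True raises PyCompileError instead of printing the error to stderr.
            py_compile.compile(str(path), doraise=True)
            results[str(path)] = None
        except py_compile.PyCompileError as exc:
            results[str(path)] = exc.msg
    return results
```

Calling it with the four cookbook files named in the log would return a dictionary whose values are all `None` when every file compiles. Note that `py_compile.compile` writes bytecode to `__pycache__` as a side effect.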
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Docling Tools Cookbook
+
+This cookbook was split into smaller files for readability:
+
+- `run.py`: entrypoint that runs all examples
+- `paths.py`: shared test resource path helpers
+- `basic_examples.py`: conversion examples across formats
+- `ocr_example.py`: advanced OCR configuration example
+
+Includes examples for:
+- PDF, DOCX, Markdown, HTML, XML, XLSX
+- PPTX (`ai_presentation.pptx`)
+- Image (`restaurant_invoice.png`)
+- Audio/Video to VTT (`agno_description.mp4`)
+
+Run with:
+
+```bash
+.venvs/demo/bin/python cookbook/91_tools/docling_tools/run.py
+```
cookbook/91_tools/docling_tools/basic_examples.py

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
+from agno.agent import Agent
+from agno.tools.docling import DoclingTools
+
+from paths import (
+    audio_video_path,
+    docx_path,
+    html_path,
+    image_path,
+    md_path,
+    pdf_path,
+    pptx_path,
+    xlsx_path,
+    xml_path,
+)
+
+
+def run_basic_examples() -> None:
+    agent = Agent(
+        tools=[DoclingTools(all=True)],
+        description="You are an agent that converts documents from all Docling parsers and exports to all supported output formats.",
+    )
+
+    agent.print_response(
+        "List supported Docling input parsers and active allowed parsers.",
+        markdown=True,
+    )
+
+    agent.print_response(
+        f"Convert to Markdown: {pdf_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to JSON and return the full JSON without summarizing: {pdf_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to YAML: {pdf_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to DocTags: {pdf_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to VTT: {pdf_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to HTML split page: {pdf_path}",
+        markdown=True,
+    )
+
+    # Additional parser examples based on static resources.
+    agent.print_response(
+        f"Convert to Markdown: {docx_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to Markdown: {md_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to Markdown: {html_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to Markdown: {xml_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to Markdown: {xlsx_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to Markdown: {pptx_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to Markdown: {image_path}",
+        markdown=True,
+    )
+    agent.print_response(
+        f"Convert to VTT: {audio_video_path}",
+        markdown=True,
+    )
+
+    # convert_string is limited by Docling to Markdown and HTML source content.
+    agent.print_response(
+        "Use convert_string_content to convert this markdown string to JSON: # Inline Markdown\n\nThis is a parser test.",
+        markdown=True,
+    )
+    agent.print_response(
+        "Use convert_string_content to convert this html string to Markdown: <h1>Inline HTML</h1><p>This is a parser test.</p>",
+        markdown=True,
+    )
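
The repeated "Convert to Markdown" calls above all follow one pattern, so the prompts could be generated from a table of sources. This is only a sketch of that idea, not what the commit does: `build_conversion_prompts` is a hypothetical helper, and the paths in the usage note are placeholders.

```python
from typing import Dict, List


def build_conversion_prompts(sources: Dict[str, str], target: str = "Markdown") -> List[str]:
    """Build one prompt per source path, in insertion order.

    Hypothetical helper for illustration; the prompt wording mirrors the cookbook's
    f-strings ("Convert to <target>: <path>").
    """
    return [f"Convert to {target}: {path}" for path in sources.values()]


# Usage sketch (needs an Agent with model credentials, so it is left commented out):
# for prompt in build_conversion_prompts({"docx": docx_path, "md": md_path}):
#     agent.print_response(prompt, markdown=True)
```

Keeping the explicit calls, as the cookbook does, makes each example easy to copy in isolation; the table form is mainly useful when the list of formats grows.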
cookbook/91_tools/docling_tools/ocr_example.py

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+from agno.agent import Agent
+from agno.tools.docling import DoclingTools
+
+from paths import pdf_path
+
+
+def run_ocr_example() -> None:
+    # pdf_ocr_engine accepts: auto | easyocr | tesseract | tesseract_cli | ocrmac | rapidocr
+    # Some engines may require extra runtime dependencies in your environment.
+    ocr_tools = DoclingTools(
+        pdf_enable_ocr=True,
+        pdf_ocr_engine="easyocr",
+        pdf_ocr_lang=["pt", "en"],
+        pdf_force_full_page_ocr=True,
+        pdf_enable_table_structure=True,
+        pdf_enable_picture_description=False,
+        pdf_document_timeout=120.0,
+    )
+
+    ocr_agent = Agent(
+        tools=[ocr_tools],
+        description="You are an agent that converts PDFs using advanced OCR.",
+    )
+
+    ocr_agent.print_response(
+        f"Convert to Markdown: {pdf_path}",
+        markdown=True,
+    )
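
Because `pdf_ocr_engine` is a free-form string, a caller may want to check it against the engine names listed in the comment above before building the tool. The sketch below assumes that list is complete; `normalize_ocr_engine` and `SUPPORTED_OCR_ENGINES` are hypothetical names, and `DoclingTools` may already perform its own validation.

```python
from typing import Tuple

# Engine names copied from the comment in ocr_example.py; assumed, not verified, to be exhaustive.
SUPPORTED_OCR_ENGINES: Tuple[str, ...] = (
    "auto",
    "easyocr",
    "tesseract",
    "tesseract_cli",
    "ocrmac",
    "rapidocr",
)


def normalize_ocr_engine(name: str) -> str:
    """Return the lower-cased engine name, or raise ValueError if it is not in the list."""
    engine = name.strip().lower()
    if engine not in SUPPORTED_OCR_ENGINES:
        allowed = ", ".join(SUPPORTED_OCR_ENGINES)
        raise ValueError(f"Unsupported OCR engine {name!r}; expected one of: {allowed}")
    return engine
```

The returned value could then be passed as `pdf_ocr_engine=normalize_ocr_engine(user_choice)`, where `user_choice` is whatever string the caller received.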
cookbook/91_tools/docling_tools/paths.py

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+from pathlib import Path
+
+repo_root = Path(__file__).resolve().parents[3]
+testing_resources_path = repo_root / "cookbook/07_knowledge/testing_resources"
+
+
+def get_test_resource_path(filename: str) -> str:
+    return str(testing_resources_path / filename)
+
+
+pdf_path = get_test_resource_path("cv_1.pdf")
+docx_path = get_test_resource_path("project_proposal.docx")
+md_path = get_test_resource_path("coffee.md")
+html_path = get_test_resource_path("company_info.html")
+xml_path = get_test_resource_path("patent_sample.xml")
+xlsx_path = get_test_resource_path("sample_products.xlsx")
+pptx_path = get_test_resource_path("ai_presentation.pptx")
+image_path = get_test_resource_path("restaurant_invoice.png")
+audio_video_path = get_test_resource_path("agno_description.mp4")
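
`get_test_resource_path` builds a path without checking that the file exists, so a missing resource only surfaces once the agent tries to convert it. A pre-flight check along the following lines could catch that earlier. It is a sketch under assumptions: `find_missing_resources` is a hypothetical helper that is not part of this commit.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_resources(directory: Path, filenames: Iterable[str]) -> List[str]:
    """Return the filenames that are not regular files inside `directory`.

    Hypothetical helper for illustration; order of the input is preserved.
    """
    return [name for name in filenames if not (directory / name).is_file()]


# Usage sketch with names from paths.py (requires the repository checkout):
# missing = find_missing_resources(testing_resources_path, ["cv_1.pdf", "coffee.md"])
# if missing:
#     raise FileNotFoundError(f"Missing test resources: {missing}")
```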
cookbook/91_tools/docling_tools/run.py

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+from basic_examples import run_basic_examples
+from ocr_example import run_ocr_example
+
+if __name__ == "__main__":
+    run_basic_examples()
+    run_ocr_example()
