Add TesseractAgent for PDF/Image OCR text extraction with command-restricted bash executor #193

brainless · 2026-01-10T10:51:25Z

Summary

Add command-restricted bash executor to manager-tools for enhanced security
Implement TesseractAgent for OCR text extraction from images and PDFs
Expose TesseractAgent in nocodo-api with GUI integration

Key Features

Command-Restricted Bash Executor

Extends BashExecutor to accept custom BashPermissions for granular command control
Provides helper methods for common patterns: only_allow(), read_only(), minimal()
Enables per-agent bash permission policies while maintaining backward compatibility
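
The allowlist idea above can be sketched roughly as follows. This is a minimal illustration, not the code in manager-tools: the names `BashPermissions`, `only_allow`, `read_only` and `minimal` come from this PR, but the struct layout, the `is_allowed` method, the contents of the presets and the metacharacter check are all assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the permission type described in this PR.
#[derive(Debug, Clone)]
pub struct BashPermissions {
    allowed: HashSet<String>,
}

impl BashPermissions {
    /// Allow exactly the listed command names and nothing else.
    pub fn only_allow(commands: &[&str]) -> Self {
        Self {
            allowed: commands.iter().map(|c| c.to_string()).collect(),
        }
    }

    /// Assumed preset: a handful of commands that only read state.
    pub fn read_only() -> Self {
        Self::only_allow(&["ls", "cat", "head", "tail", "grep", "wc"])
    }

    /// Assumed preset: the smallest set that is still useful.
    pub fn minimal() -> Self {
        Self::only_allow(&["ls", "pwd"])
    }

    /// True when the first token of `command_line` is allowlisted and the
    /// line contains no shell metacharacters that could chain, redirect,
    /// or substitute other commands.
    pub fn is_allowed(&self, command_line: &str) -> bool {
        const FORBIDDEN: &[char] = &[';', '|', '&', '`', '$', '>', '<', '(', ')', '\n'];
        if command_line.chars().any(|c| FORBIDDEN.contains(&c)) {
            return false;
        }
        match command_line.split_whitespace().next() {
            Some(program) => self.allowed.contains(program),
            None => false,
        }
    }
}
```

With this sketch, `BashPermissions::only_allow(&["tesseract"])` would accept `tesseract scan.png stdout` and reject both `rm -rf /` and `tesseract scan.png stdout; rm -rf /`. Rejecting metacharacters outright is a deliberately blunt choice for the example; the real executor may parse commands differently.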

TesseractAgent

Specialized AI agent for extracting text from images using Tesseract OCR
Restricted bash access (only tesseract command allowed)
Supports multiple image formats: PNG, JPG, PDF, TIFF
LLM-powered post-processing for OCR error correction and text formatting
Pre-condition verification for tesseract installation
Configurable with a single image file path
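
The pre-condition check and the OCR call could look something like the sketch below. It is illustrative only: the function names are hypothetical, and it calls `std::process::Command` directly for brevity, whereas the agent in this PR goes through the restricted bash executor. The only Tesseract CLI features it relies on are `tesseract --version` and the `stdout` output base, which makes Tesseract print the recognized text instead of writing a file.

```rust
use std::path::Path;
use std::process::Command;

/// Pre-condition check: true when a `tesseract` binary can be started.
/// Spawning returns an error when the binary is not on PATH, which is
/// mapped to `false` here.
pub fn tesseract_available() -> bool {
    Command::new("tesseract")
        .arg("--version")
        .output()
        .map(|out| out.status.success())
        .unwrap_or(false)
}

/// Build the arguments for `tesseract <image> stdout`.
pub fn tesseract_args(image: &Path) -> Vec<String> {
    vec![image.to_string_lossy().into_owned(), "stdout".to_string()]
}

/// Run OCR on one image and return the raw, uncorrected text.
/// Hypothetical helper; LLM post-processing would happen after this step.
pub fn run_ocr(image: &Path) -> Result<String, String> {
    if !tesseract_available() {
        return Err("tesseract is not installed or not on PATH".to_string());
    }
    let output = Command::new("tesseract")
        .args(tesseract_args(image))
        .output()
        .map_err(|e| format!("failed to start tesseract: {}", e))?;
    if !output.status.success() {
        // Surface Tesseract's own error message to the caller.
        return Err(String::from_utf8_lossy(&output.stderr).into_owned());
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Checking for the binary first lets the agent fail with a clear message before any OCR work is attempted.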

API & GUI Integration

New /agents/tesseract/execute endpoint in nocodo-api
TypeScript type generation for TesseractAgentConfig
GUI form with Image Path input field
Automatic directory extraction from image path
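
As a rough sketch of the image validation and directory extraction mentioned above, assuming the working directory is simply the parent of the image path: the function names are hypothetical, the extension list mirrors the formats named in this PR, and treating `jpeg` and `tif` as aliases is an assumption of the example.

```rust
use std::path::{Path, PathBuf};

/// True when the file extension matches one of the formats listed in this PR.
/// The comparison is case-insensitive; `jpeg` and `tif` are assumed aliases.
pub fn has_supported_extension(image_path: &Path) -> bool {
    image_path
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.to_ascii_lowercase())
        .map(|ext| matches!(ext.as_str(), "png" | "jpg" | "jpeg" | "pdf" | "tif" | "tiff"))
        .unwrap_or(false)
}

/// Directory that contains the image. Bare file names such as `page1.png`
/// have an empty parent, so this falls back to the current directory.
pub fn working_dir(image_path: &Path) -> PathBuf {
    match image_path.parent() {
        Some(dir) if !dir.as_os_str().is_empty() => dir.to_path_buf(),
        _ => PathBuf::from("."),
    }
}
```

For example, `working_dir(Path::new("/data/scans/page1.png"))` yields `/data/scans`, while a bare `page1.png` yields `.`.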

Test plan

Verify command-restricted bash executor only allows whitelisted commands
Test TesseractAgent with various image formats (PNG, JPG, PDF, TIFF)
Confirm OCR text extraction and LLM post-processing work correctly
Test API endpoint /agents/tesseract/execute with valid image paths
Verify GUI form accepts image paths and displays results
Ensure backward compatibility with existing agent functionality

🤖 Generated with Claude Code

- Extend BashExecutor to accept custom BashPermissions
- Add helper methods for common permission patterns (only_allow, read_only, minimal)
- Implement builder pattern for ToolExecutor with custom bash configuration
- Export new public APIs for creating restricted executors
- Add comprehensive documentation and examples
- Maintain backward compatibility with existing API
- Enable per-agent bash permission policies

This allows creating specialized agents (e.g., TesseractAgent) that can only execute specific whitelisted commands while maintaining full security.

Implement specialized AI agent for extracting text from images using Tesseract OCR. The agent features restricted bash access (tesseract command only), LLM-powered text cleaning and formatting, and comprehensive pre-condition verification.

Key features:
- Restricted bash executor allows only tesseract command
- Supports multiple image formats (PNG, JPG, PDF, TIFF)
- LLM-based post-processing for OCR error correction
- Pre-condition checks for tesseract installation
- Configurable working directory for file operations
- Factory methods for easy instantiation

Also includes minor formatting improvements in manager-tools SQLite module.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Configure TesseractAgent to work with individual image files instead of a base directory path. Add API endpoint, TypeScript type generation, and GUI form support.

Changes:
- Update TesseractAgent to accept image_path instead of base_path
- Add image validation and automatic directory extraction
- Update system prompt to reference specific image file
- Add TesseractAgent to API helpers and execution handlers
- Register /agents/tesseract/execute endpoint in main.rs
- Generate TypeScript types including TesseractAgentConfig
- Add Image Path input field to Home page form
- Mark task as complete

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

github-actions · 2026-01-10T10:51:35Z

📊 Code Complexity Analysis

Lines added: 3037
Lines removed: 8
Net change: 3029

💡 Suggestion: This is a large PR with 3037 added lines. Consider:

Breaking it into smaller, focused PRs
Adding comprehensive tests for the new functionality
Updating documentation as needed

Automated analysis by GitHub Actions

github-actions · 2026-01-10T10:51:51Z

🤖 Automated Code Review Summary

This automated review was generated to help ensure code quality and security standards.

Rust Code Analysis

⚠️ Code formatting: Some Rust files are not formatted according to rustfmt standards.
- Run cargo fmt to fix formatting issues.
⚠️ Linting: Clippy found potential issues in Rust code.
- Run cargo clippy --workspace --all-targets -- --deny warnings to see detailed warnings.

TypeScript/JavaScript Code Analysis

Security Analysis

⚠️ Potential secrets: Found references to passwords, secrets, or tokens.
- Please verify no hardcoded credentials are present.
ℹ️ Debug output: Found debug print statements.
- Consider removing or replacing with proper logging.

Recommendations

Run the full CI pipeline to ensure all tests pass
Consider adding tests for any new functionality
Update documentation if API changes are involved
Follow the development workflow described in CLAUDE.md

This review was automatically generated. Please address any issues before merging.

brainless and others added 3 commits January 9, 2026 15:38

brainless merged commit ad8b962 into main Jan 10, 2026
4 checks passed

brainless deleted the pdf-image-to-text-extraction-tool-agent branch January 10, 2026 10:54
