@brainless
Owner

Summary

  • Add command-restricted bash executor to manager-tools for enhanced security
  • Implement TesseractAgent for OCR text extraction from images and PDFs
  • Expose TesseractAgent in nocodo-api with GUI integration

Key Features

Command-Restricted Bash Executor

  • Extends BashExecutor to accept custom BashPermissions for granular command control
  • Provides helper methods for common patterns: only_allow(), read_only(), minimal()
  • Enables per-agent bash permission policies while maintaining backward compatibility
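A minimal sketch of how such an allowlist might behave. The PR text names `BashPermissions`, `only_allow()`, `read_only()`, and `minimal()` but does not show their signatures, so the struct layout, the `is_allowed` method, the preset command lists, and the metacharacter check below are all assumptions made for illustration, not the manager-tools implementation.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the PR's `BashPermissions` type.
#[derive(Debug, Clone)]
pub struct BashPermissions {
    allowed: HashSet<String>,
}

impl BashPermissions {
    /// Permit only the listed program names.
    pub fn only_allow<I, S>(commands: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self {
            allowed: commands.into_iter().map(Into::into).collect(),
        }
    }

    /// Illustrative read-only preset (assumed list; the real one may differ).
    pub fn read_only() -> Self {
        Self::only_allow(["ls", "cat", "head", "tail", "grep"])
    }

    /// Illustrative minimal preset (assumed list; the real one may differ).
    pub fn minimal() -> Self {
        Self::only_allow(["ls", "pwd"])
    }

    /// Returns true when the first token is allowlisted and the line
    /// contains no shell metacharacters that could chain another command.
    pub fn is_allowed(&self, command_line: &str) -> bool {
        const SHELL_META: &[char] = &[';', '|', '&', '`', '$', '>', '<', '\n'];
        if command_line.contains(SHELL_META) {
            return false;
        }
        match command_line.split_whitespace().next() {
            Some(program) => self.allowed.contains(program),
            None => false,
        }
    }
}
```

Under these assumptions, `BashPermissions::only_allow(["tesseract"])` would accept `tesseract scan.png stdout` and reject both `rm -rf /` and any attempt to append a second command after `;`.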

TesseractAgent

  • Specialized AI agent for extracting text from images using Tesseract OCR
  • Restricted bash access (only tesseract command allowed)
  • Supports multiple input formats: PNG, JPG, PDF, TIFF
  • LLM-powered post-processing for OCR error correction and text formatting
  • Pre-condition verification for tesseract installation
  • Configured with a single image file path
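The pre-condition check and the restricted invocation could look roughly like the sketch below. It relies only on the standard `tesseract --version` flag and the `tesseract <image> stdout` form of the CLI; the function names `tesseract_installed` and `ocr_command` are hypothetical and do not come from the PR.

```rust
use std::path::Path;
use std::process::{Command, Stdio};

/// Hypothetical pre-condition check: true if `tesseract --version`
/// can be spawned and exits successfully.
pub fn tesseract_installed() -> bool {
    Command::new("tesseract")
        .arg("--version")
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false) // spawn failure (e.g. binary not on PATH) counts as "not installed"
}

/// Hypothetical helper: builds `tesseract <image> stdout`, which asks
/// Tesseract to write the recognized text to standard output.
/// The command is only constructed here, not executed.
pub fn ocr_command(image: &Path) -> Command {
    let mut cmd = Command::new("tesseract");
    cmd.arg(image).arg("stdout");
    cmd
}
```

An agent following this pattern would call `tesseract_installed()` once before accepting work, then run the command from `ocr_command` and pass the captured text to the LLM for cleanup.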

API & GUI Integration

  • New /agents/tesseract/execute endpoint in nocodo-api
  • TypeScript type generation for TesseractAgentConfig
  • GUI form with Image Path input field
  • Automatic directory extraction from image path
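As an illustration of the directory extraction and image validation mentioned above, here is a small sketch using only `std::path`. The PR names `TesseractAgentConfig` and an `image_path` field; the field type, the helper names `working_dir` and `has_supported_extension`, and the exact extension list are assumptions.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical mirror of the config; only `image_path` is named in the PR.
#[derive(Debug, Clone)]
pub struct TesseractAgentConfig {
    pub image_path: String,
}

/// Derives the agent's working directory from the image path.
/// A bare file name (empty parent) falls back to the current directory.
pub fn working_dir(image_path: &Path) -> PathBuf {
    match image_path.parent() {
        Some(dir) if !dir.as_os_str().is_empty() => dir.to_path_buf(),
        _ => PathBuf::from("."),
    }
}

/// Case-insensitive extension check against the formats listed in the PR.
/// Variants such as `jpeg` or `tif` are deliberately not included here.
pub fn has_supported_extension(image_path: &Path) -> bool {
    const SUPPORTED: [&str; 4] = ["png", "jpg", "pdf", "tiff"];
    image_path
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| SUPPORTED.contains(&ext.to_ascii_lowercase().as_str()))
        .unwrap_or(false)
}
```

With this sketch, a config whose `image_path` is `/data/scans/invoice.png` would yield `/data/scans` as the working directory, so the GUI only needs to collect the single Image Path field.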

Test plan

  • Verify command-restricted bash executor only allows whitelisted commands
  • Test TesseractAgent with various image formats (PNG, JPG, PDF, TIFF)
  • Confirm OCR text extraction and LLM post-processing work correctly
  • Test API endpoint /agents/tesseract/execute with valid image paths
  • Verify GUI form accepts image paths and displays results
  • Ensure backward compatibility with existing agent functionality

🤖 Generated with Claude Code

brainless and others added 3 commits January 9, 2026 15:38
- Extend BashExecutor to accept custom BashPermissions
- Add helper methods for common permission patterns (only_allow, read_only, minimal)
- Implement builder pattern for ToolExecutor with custom bash configuration
- Export new public APIs for creating restricted executors
- Add comprehensive documentation and examples
- Maintain backward compatibility with existing API
- Enable per-agent bash permission policies

This allows creating specialized agents (e.g., TesseractAgent) that can only
execute specific whitelisted commands while maintaining full security.

Implement specialized AI agent for extracting text from images using Tesseract OCR. The agent features restricted bash access (tesseract command only), LLM-powered text cleaning and formatting, and comprehensive pre-condition verification.

Key features:
- Restricted bash executor allows only tesseract command
- Supports multiple image formats (PNG, JPG, PDF, TIFF)
- LLM-based post-processing for OCR error correction
- Pre-condition checks for tesseract installation
- Configurable working directory for file operations
- Factory methods for easy instantiation

Also includes minor formatting improvements in manager-tools SQLite module.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Configure TesseractAgent to work with individual image files instead of a base directory path. Add API endpoint, TypeScript type generation, and GUI form support.

Changes:
- Update TesseractAgent to accept image_path instead of base_path
- Add image validation and automatic directory extraction
- Update system prompt to reference specific image file
- Add TesseractAgent to API helpers and execution handlers
- Register /agents/tesseract/execute endpoint in main.rs
- Generate TypeScript types including TesseractAgentConfig
- Add Image Path input field to Home page form
- Mark task as complete

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@github-actions

📊 Code Complexity Analysis

  • Lines added: 3037
  • Lines removed: 8
  • Net change: 3029

💡 Suggestion: This is a large PR with 3037 added lines. Consider:

  • Breaking it into smaller, focused PRs
  • Adding comprehensive tests for the new functionality
  • Updating documentation as needed

Automated analysis by GitHub Actions

@github-actions

🤖 Automated Code Review Summary

This automated review was generated to help ensure code quality and security standards.

Rust Code Analysis

  • ⚠️ Code formatting: Some Rust files are not formatted according to rustfmt standards.

    • Run cargo fmt to fix formatting issues.
  • ⚠️ Linting: Clippy found potential issues in Rust code.

    • Run cargo clippy --workspace --all-targets -- --deny warnings to see detailed warnings.

TypeScript/JavaScript Code Analysis

Security Analysis

  • ⚠️ Potential secrets: Found references to passwords, secrets, or tokens.

    • Please verify no hardcoded credentials are present.
  • ℹ️ Debug output: Found debug print statements.

    • Consider removing or replacing with proper logging.

Recommendations

  • Run the full CI pipeline to ensure all tests pass
  • Consider adding tests for any new functionality
  • Update documentation if API changes are involved
  • Follow the development workflow described in CLAUDE.md

This review was automatically generated. Please address any issues before merging.

@brainless brainless merged commit ad8b962 into main Jan 10, 2026
4 checks passed
@brainless brainless deleted the pdf-image-to-text-extraction-tool-agent branch January 10, 2026 10:54