English | 中文 | 日语

Docr 🚀

1. Overview 🌟

🛠️ Modular, component-based design: pull in only the features you need. 🚀 Easy to extend and flexible to use, just like playing with building blocks!

Docr is a modular, component-based toolkit for document analysis and processing. It is designed for flexibility and extensibility, so you can use only the document processing functionality you need and add more as requirements grow.

2. Features 🛠️

  • 📄 Layout Analysis
  • 🔢 Formula Detection and Recognition
  • 📝 Optical Character Recognition (OCR)
  • 📊 Table Structure Recognition
  • 📚 Reading Order Analysis
  • 🖼️ Image Processing Utilities

3. Installation and Usage 📦

3.1 Prerequisites

  • Python 3.10 or higher
  • Poetry (for dependency management)

3.2 Setup

  1. Clone the repository:

    git clone https://github.com/yjmm10/docr.git
    cd docr
    git clone https://huggingface.co/liferecords/Telos.git docr/models
  2. Install dependencies:

    poetry install -v
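
The prerequisites and setup steps above can be sanity-checked with a small script. This is only a sketch: `check_setup` is a hypothetical helper, not part of Docr, and it assumes the model repository was cloned into `docr/models` exactly as in step 1.

```python
import shutil
import sys
from pathlib import Path


def check_setup(repo_root: str = ".") -> list[str]:
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper (not part of Docr) that mirrors sections 3.1 and 3.2.
    """
    problems = []

    # Section 3.1: Python 3.10 or higher.
    if sys.version_info < (3, 10):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.10+ required, found {found}")

    # Section 3.1: Poetry must be available on PATH.
    if shutil.which("poetry") is None:
        problems.append("poetry not found on PATH")

    # Section 3.2, step 1: model files are cloned into docr/models.
    models_dir = Path(repo_root) / "docr" / "models"
    if not models_dir.is_dir():
        problems.append(f"model directory missing: {models_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("WARNING:", problem)
```

Run it from the repository root; no output means the three checks passed.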

3.3 Usage

Here's a quick example of how to use Docr for OCR:

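    import cv2
    from docr import OCR

    # Initialize the OCR model
    ocr_model = OCR()

    # Read an image (cv2.imread returns None if the file cannot be read)
    image = cv2.imread("path/to/your/image.png")
    if image is None:
        raise FileNotFoundError("Could not read path/to/your/image.png")

    # Perform OCR
    result = ocr_model(image)

    print(result)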
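
To run the same call over a folder of images, something like the following might work. It is a sketch under assumptions: `find_images`, `load_with_cv2`, and `run_batch` are hypothetical helpers, not part of Docr, and the only thing assumed about the model is what the example above shows, namely that it is callable on an image loaded with `cv2.imread`.

```python
from pathlib import Path
from typing import Any, Callable, Iterable

# Assumed set of extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def find_images(folder: str) -> list[Path]:
    """Return image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def load_with_cv2(path: Path) -> Any:
    """Default loader; cv2.imread returns None when a file cannot be read."""
    import cv2  # imported lazily so the other helpers work without OpenCV
    return cv2.imread(str(path))


def run_batch(
    model: Callable[[Any], Any],
    paths: Iterable[Path],
    loader: Callable[[Path], Any] = load_with_cv2,
) -> dict[str, Any]:
    """Apply `model` to every readable image and map each path to its result."""
    results: dict[str, Any] = {}
    for path in paths:
        image = loader(path)
        if image is None:
            print(f"skipping unreadable image: {path}")
            continue
        results[str(path)] = model(image)
    return results


# Usage with Docr (requires the setup from section 3.2):
#   from docr import OCR
#   results = run_batch(OCR(), find_images("path/to/your/images"))
```

Passing the loader as a parameter keeps the loop independent of OpenCV, which makes it easy to try with a stand-in model before the real weights are downloaded.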

Docr comes with a Streamlit-based web UI for easy demonstration of its capabilities:

  1. Run the demo:

    streamlit run webui/demo.py
  2. Open your browser and navigate to the provided URL (usually http://localhost:8501)

  3. Upload an image and select the model you want to use for processing

Docr also provides a FastAPI-based API service for integration into other applications:

  1. Start the API server:

    uvicorn api.docr_api:app --host 0.0.0.0 --port 8000
  2. The API documentation will be available at http://localhost:8000/docs
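
To see which routes the running server exposes without opening a browser, you can read the OpenAPI schema that FastAPI serves at `/openapi.json` by default. The sketch below assumes Docr keeps that default schema URL and that the server is listening on port 8000 as started above; `fetch_openapi` and `list_endpoints` are hypothetical helper names, not part of Docr.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_openapi(base_url: str = "http://localhost:8000") -> dict:
    """Download the OpenAPI schema (FastAPI's default location is /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_endpoints(spec: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema dict."""
    endpoints = []
    for path, operations in spec.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                endpoints.append((method.upper(), path))
    return sorted(endpoints)


# Usage, with the API server from step 1 running:
#   for method, path in list_endpoints(fetch_openapi()):
#       print(method, path)
```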

4. Development 🔬

For detailed information on development, please refer to the development guide. It explains how to set up your IDE for working with Docr, including configuration for the src layout.

5. Contributing 🤝

We welcome contributions! Please see our Contributing Guidelines for more details.

6. License 📄

Docr is released under the MIT License. See the LICENSE file for more details.

7. Contact 📧

For any questions or feedback, please contact the project maintainer: liferecords (yjmm10@yeah.net)