Suhaib3100/structa-ai

Structa AI 📄🤖

AI-powered mobile document scanner that captures physical documents and converts them to structured digital formats.

Features

  • 📷 Smart Document Capture - Camera-based scanning with auto-crop and perspective correction
  • 🔍 OCR Engine - Extract text from printed and handwritten documents
  • 📊 Table Detection - Automatically detect and extract tables
  • πŸ“ Multiple Export Formats - Export to PDF, Excel, CSV, JSON, Markdown
  • 🔄 Offline Support - Queue uploads when offline, sync when connected
  • 🔒 Secure - End-to-end encryption, data isolation, GDPR compliance
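
The offline behaviour in the list above (queue uploads while offline, sync once connected) can be illustrated with a small sketch. This is not the app's implementation, which lives in the TypeScript mobile code; it is a language-neutral outline written in Python, and `OfflineUploadQueue`, `send`, and `set_online` are hypothetical names chosen for illustration.

```python
from collections import deque
from typing import Callable


class OfflineUploadQueue:
    """Hypothetical FIFO queue that holds uploads until the network is available."""

    def __init__(self, send: Callable[[str], bool]):
        self._send = send          # returns True when the upload succeeded
        self._pending = deque()    # items waiting to be uploaded, oldest first
        self.online = False

    def enqueue(self, page_path: str) -> None:
        """Add an item; try to send immediately if we are online."""
        self._pending.append(page_path)
        if self.online:
            self.flush()

    def set_online(self, online: bool) -> None:
        """Record a connectivity change and sync when the connection returns."""
        self.online = online
        if online:
            self.flush()

    def flush(self) -> int:
        """Send pending items in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break              # keep the item for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)


uploaded = []
queue = OfflineUploadQueue(lambda path: uploaded.append(path) is None)
queue.enqueue("page-1.jpg")
queue.enqueue("page-2.jpg")
queue.set_online(True)
print(uploaded, len(queue))
```

Stopping at the first failed send keeps pages in their original order, which matters for multi-page documents.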

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Mobile App (Expo)                        │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────────────────┐ │
│  │ Camera  │  │ Image   │  │ Upload  │  │  Offline Queue      │ │
│  │ Capture │→ │ Process │→ │ Manager │→ │  (Background Sync)  │ │
│  └─────────┘  └─────────┘  └─────────┘  └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                      Backend API (Express)                      │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────────────────┐ │
│  │  Auth   │  │ Upload  │  │  Jobs   │  │    Rate Limiting    │ │
│  │  JWT    │  │ Handler │  │  Queue  │  │    Validation       │ │
│  └─────────┘  └─────────┘  └─────────┘  └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                                │
           ┌────────────────────┼────────────────────┐
           ▼                    ▼                    ▼
     ┌──────────┐        ┌──────────┐        ┌──────────────┐
     │ Postgres │        │  Redis   │        │    MinIO     │
     │   DB     │        │  Queue   │        │   Storage    │
     └──────────┘        └──────────┘        └──────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                   AI Workers (Python/FastAPI)                   │
│  ┌────────────┐  ┌─────┐  ┌────────┐  ┌────────┐  ┌───────────┐ │
│  │ Preprocess │→ │ OCR │→ │ Layout │→ │ Tables │→ │ Structure │ │
│  │   Image    │  │     │  │ Detect │  │ Extract│  │   Data    │ │
│  └────────────┘  └─────┘  └────────┘  └────────┘  └───────────┘ │
└─────────────────────────────────────────────────────────────────┘
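
The AI Workers row of the diagram describes a linear pipeline: each stage receives the output of the previous one. A minimal sketch of that chaining is shown here, assuming each stage is a plain function over a shared context object. `PageContext`, `make_stage`, and `run_pipeline` are hypothetical names, and the stages are placeholders that only record that they ran; the real processors would call OpenCV, the OCR engine, and so on.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PageContext:
    """Accumulates results as one page moves through the stages."""
    image: bytes
    results: Dict[str, object] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


Stage = Callable[[PageContext], PageContext]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that records its name instead of doing real work."""
    def stage(ctx: PageContext) -> PageContext:
        ctx.trace.append(name)
        ctx.results[name] = None  # a real stage would store its output here
        return ctx
    return stage


# Stage order follows the diagram: Preprocess -> OCR -> Layout -> Tables -> Structure.
STAGE_NAMES = ("preprocess", "ocr", "layout", "tables", "structure")
PIPELINE: List[Stage] = [make_stage(name) for name in STAGE_NAMES]


def run_pipeline(image: bytes, stages: List[Stage] = PIPELINE) -> PageContext:
    """Run every stage in order, passing the same context along."""
    ctx = PageContext(image=image)
    for stage in stages:
        ctx = stage(ctx)
    return ctx


ctx = run_pipeline(b"fake-image-bytes")
print(ctx.trace)
```

Keeping the stages in a list makes it straightforward to skip or reorder steps, for example omitting table extraction for text-only documents.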

Tech Stack

Mobile App

  • Expo SDK 54 - Managed workflow
  • React Native - Cross-platform UI
  • TypeScript - Type safety
  • React Navigation - Native stack navigation

Backend API

  • Express.js - HTTP server
  • Prisma - PostgreSQL ORM
  • BullMQ - Job queue
  • Jose - JWT authentication
  • Zod - Request validation

AI Workers

  • FastAPI - Python API server
  • OpenCV - Image processing
  • Tesseract/EasyOCR - Text recognition
  • LayoutParser - Document layout detection
  • img2table - Table extraction

Infrastructure

  • PostgreSQL - Primary database
  • Redis - Job queue & caching
  • MinIO - S3-compatible object storage
  • Docker - Containerization

Getting Started

Prerequisites

  • Node.js 20+
  • Python 3.11+
  • Docker & Docker Compose
  • PostgreSQL 15+
  • Redis 7+

Quick Start (Docker)

# Clone the repository
git clone https://github.com/Suhaib3100/structa-ai.git
cd structa-ai

# Start all services
docker-compose up -d

# The services will be available at:
# - Mobile Metro: http://localhost:8081
# - Backend API: http://localhost:3000
# - AI Workers: http://localhost:8000
# - MinIO Console: http://localhost:9001

Development Setup

1. Mobile App

# Install dependencies
npm install

# Start Expo development server
npx expo start

2. Backend API

cd backend

# Install dependencies
npm install

# Setup environment
cp .env.example .env

# Generate Prisma client
npm run db:generate

# Run migrations
npm run db:migrate

# Start development server
npm run dev

3. AI Workers

cd ai-workers

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Setup environment
cp .env.example .env

# Start server
python main.py

Project Structure

structa-ai/
├── app/                    # Mobile app screens
│   ├── screens/           # Screen components
│   └── components/        # Reusable UI components
├── domain/                # Business logic
│   ├── models/           # Data models
│   ├── workflows/        # State machines
│   └── services/         # Domain services
├── infra/                 # Infrastructure layer
│   ├── camera/           # Camera service
│   ├── image/            # Image processing
│   ├── upload/           # Upload management
│   ├── api/              # API client
│   └── network/          # Network state
├── backend/              # Node.js backend
│   ├── src/
│   │   ├── api/         # Express routes
│   │   ├── config/      # Configuration
│   │   └── services/    # Business services
│   └── prisma/          # Database schema
├── ai-workers/          # Python AI processing
│   ├── processors/      # AI processors
│   └── exporters/       # Export services
└── docker-compose.yml   # Docker orchestration

API Endpoints

Authentication

  • POST /api/auth/register - Create account
  • POST /api/auth/login - Login
  • POST /api/auth/logout - Logout
  • GET /api/auth/profile - Get profile

Documents

  • GET /api/documents - List documents
  • POST /api/documents - Create document
  • GET /api/documents/:id - Get document
  • DELETE /api/documents/:id - Delete document
  • GET /api/documents/:id/status - Processing status
  • POST /api/documents/:id/process - Start processing

Uploads

  • POST /api/uploads/:documentId/pages - Upload page
  • DELETE /api/uploads/:documentId/pages/:pageId - Delete page
  • PUT /api/uploads/:documentId/pages/reorder - Reorder pages
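
As a rough illustration of how a client might address these endpoints, the sketch below builds (but does not send) requests for starting processing and checking status. It uses only the Python standard library. The `Authorization: Bearer` header is an assumption based on the JWT authentication listed in the tech stack, the request and response bodies are not documented here, and `api_request` and `processing_requests` are hypothetical helper names.

```python
import json
import urllib.request

# Backend API address from the Quick Start section.
BASE_URL = "http://localhost:3000"


def api_request(method, path, token, body=None):
    """Build a urllib Request for an API path; nothing is sent here."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {token}"}  # assumed header scheme
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def processing_requests(document_id, token):
    """Requests to start processing a document and then check its status."""
    return [
        api_request("POST", f"/api/documents/{document_id}/process", token),
        api_request("GET", f"/api/documents/{document_id}/status", token),
    ]


for req in processing_requests("doc-123", "example-token"):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would perform the call against a running backend.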

Environment Variables

Backend (.env)

DATABASE_URL=postgresql://user:pass@localhost:5432/structa
REDIS_URL=redis://localhost:6379
JWT_SECRET=your-secret-key
STORAGE_ENDPOINT=http://localhost:9000
STORAGE_BUCKET=structa-documents

AI Workers (.env)

AI_PORT=8000
AI_REDIS_URL=redis://localhost:6379
AI_OCR_ENGINE=tesseract
AI_STORAGE_TYPE=s3
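
For illustration, the AI worker variables above could be read into a typed settings object along these lines. This is a sketch, not the project's actual configuration code: `WorkerSettings` and `load_settings` are hypothetical names, and the fallback values simply mirror the example values shown above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerSettings:
    """Hypothetical container for the AI_* variables."""
    port: int
    redis_url: str
    ocr_engine: str
    storage_type: str


def load_settings(env=None) -> WorkerSettings:
    """Read AI_* variables from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    return WorkerSettings(
        port=int(env.get("AI_PORT", "8000")),
        redis_url=env.get("AI_REDIS_URL", "redis://localhost:6379"),
        ocr_engine=env.get("AI_OCR_ENGINE", "tesseract"),
        storage_type=env.get("AI_STORAGE_TYPE", "s3"),
    )


settings = load_settings({"AI_PORT": "9000"})
print(settings)
```

Accepting the mapping as a parameter keeps the loader easy to test without touching the real environment.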

Phase Completion Status

  • ✅ Phase 1: Foundation (Expo, TypeScript, Models)
  • ✅ Phase 2: Mobile Runtime (Permissions, Storage, Background)
  • ✅ Phase 3: Image Quality (Preprocessing, Multi-page)
  • ✅ Phase 4: Network & Transfer (Chunked Upload, Offline Queue)
  • ✅ Phase 5: Backend API (Express, Prisma, BullMQ)
  • ✅ Phase 6: AI Pipeline (OCR, Layout, Tables)
  • ✅ Phase 7: Data Structuring (Block Segmentation, Validation)
  • ✅ Phase 8: Export (PDF, Excel, CSV, JSON)
  • ✅ Phase 9: Security (Encryption, Data Isolation, Audit)
  • ✅ Phase 10: Scalability (Metrics, Health, Feature Flags)

License

MIT License - see LICENSE for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request
