Skip to content

bartalor/DocMind

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DocMind

A RAG-based AI assistant that ingests documents, embeds and indexes them, and answers questions grounded in the uploaded content. Uses a multi-agent architecture with autonomous tool use — the model decides when to search or list sources.

Architecture

Upload → Parse → Chunk → Embed → FAISS Vector Store
                                        ↓
                        Query → Agent (tool loop) → Response

The agent runs a tool-use loop: it receives a query, decides whether to call search_documents or list_sources, executes the tool, feeds results back, and repeats until it has enough context to answer.

Project Structure

backend/app/
├── main.py                 # FastAPI app, lifespan, singleton init
├── config.py               # Pydantic Settings (env-based config)
├── deps.py                 # Typed accessors over app.state for DI
├── models.py               # Shared Pydantic models (Chunk)
├── api/
│   ├── documents.py        # POST /documents/upload
│   └── chat.py             # POST /chat/
├── ingestion/
│   ├── parser.py           # PDF, DOCX, TXT parsing
│   └── chunker.py          # Fixed-size text chunking
├── embeddings/
│   └── embedder.py         # Sentence-Transformers wrapper
├── vectorstore/
│   └── faiss_store.py      # FAISS index + chunk metadata
├── rag/
│   ├── retriever.py        # Query → relevant chunks
│   └── generator.py        # Chunks + query → Claude API → answer
└── agents/
    └── docmind_agent.py    # Agentic tool-use loop

Tech Stack

  • Python 3.11+
  • FastAPI — async web framework
  • FAISS — vector similarity search
  • Sentence-Transformers — document embedding (all-MiniLM-L6-v2)
  • Claude API (Anthropic) — LLM for generation and agent reasoning
  • pdfplumber / python-docx — document parsing

Setup

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Create a .env file in the project root:

ANTHROPIC_API_KEY=your-key-here

Run

uvicorn backend.app.main:app --reload

API at http://localhost:8000. FastAPI auto-generates interactive API docs (Swagger UI) at /docs — you can test all endpoints from the browser.

Tests

pytest

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages