chatbot_project/ │ ├── app/ │ ├── main.py │ ├── pages/ │ │ ├── chat.py │ │ ├── document_upload.py │ │ └── search.py │ └── components/ │ ├── chatbot.py │ └── document_viewer.py │ ├── backend/ │ ├── database/ │ │ ├── mongodb_client.py │ │ └── models.py │ ├── document_processing/ │ │ ├── pdf_processor.py │ │ ├── jsonl_processor.py │ │ └── batch_processor.py │ ├── search/ │ │ ├── semantic_search.py │ │ └── filtering.py │ ├── ai_models/ │ │ ├── response_generator.py │ │ ├── function_caller.py │ │ └── model_loader.py │ └── utils/ │ ├── text_splitter.py │ └── metadata_extractor.py │ ├── config/ │ ├── app_config.py │ └── logging_config.py │ ├── tests/ │ ├── test_document_processing.py │ ├── test_search.py │ ├── test_database.py │ └── test_ai_models.py │ ├── requirements.txt └── README.md

File Contents Overview

app/

main.py

Set up the main Streamlit application
Configure page layout and navigation
Import and render other pages/components

pages/chat.py

Implement the chat interface
Handle user input and display bot responses
Integrate with backend chatbot component

pages/document_upload.py

Create file upload interface for PDFs and JSONL files
Trigger document processing on upload
Display upload status and processing results

pages/search.py

Implement search interface with filters
Connect to backend search functionality
Display search results

components/chatbot.py

Create a reusable Streamlit component for the chatbot interface
Handle chat history and message display

components/document_viewer.py

Implement a component to view processed documents
Display document metadata and content snippets

backend/

database/mongodb_client.py

Set up MongoDB connection
Implement CRUD operations for documents and metadata

database/models.py

Define data models for documents, metadata, and chat history

document_processing/pdf_processor.py

Implement PDF parsing and text extraction
Handle PDF-specific metadata extraction

document_processing/jsonl_processor.py

Implement JSONL parsing and data extraction
Handle JSONL-specific metadata extraction

document_processing/batch_processor.py

Implement batch processing logic for large documents
Manage processing queue and status updates

search/semantic_search.py

Implement semantic search functionality
Integrate with a vector database or similarity search library

search/filtering.py

Implement metadata-based filtering
Combine filtering with semantic search results

utils/text_splitter.py

Implement text splitting algorithms
Handle different splitting strategies (by sentence, paragraph, etc.)

utils/metadata_extractor.py

Implement generic metadata extraction
Handle different metadata fields and formats

ai_models/response_generator.py

Implement interface with language model API or local model
Handle context management for coherent conversations
Generate responses based on user input and relevant document content

ai_models/function_caller.py

Define a set of functions that the AI can call
Implement logic for parsing AI function calls and executing appropriate actions
Handle error cases and unexpected inputs in function calls

ai_models/model_loader.py

Manage initialization and loading of AI models
Implement model switching logic if multiple models are used
Handle model-specific configurations and optimizations

config/

app_config.py

Store application-wide configuration
Include settings for Streamlit, MongoDB, and processing options

logging_config.py

Configure logging for the application
Set up different log levels and outputs

tests/

test_document_processing.py

Unit tests for PDF and JSONL processors
Test batch processing functionality

test_search.py

Unit tests for semantic search and filtering
Test search result accuracy and performance

test_database.py

Unit tests for MongoDB operations
Test data model integrity and CRUD operations

requirements.txt

List all Python package dependencies with versions

README.md

Provide project overview and setup instructions
Include usage guidelines and contribution information

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

File Contents Overview

app/

main.py

pages/chat.py

pages/document_upload.py

pages/search.py

components/chatbot.py

components/document_viewer.py

backend/

database/mongodb_client.py

database/models.py

document_processing/pdf_processor.py

document_processing/jsonl_processor.py

document_processing/batch_processor.py

search/semantic_search.py

search/filtering.py

utils/text_splitter.py

utils/metadata_extractor.py

ai_models/response_generator.py

ai_models/function_caller.py

ai_models/model_loader.py

config/

app_config.py

logging_config.py

tests/

test_document_processing.py

test_search.py

test_database.py

requirements.txt

README.md

FilesExpand file tree

example_directory_structure.md

Latest commit

History

example_directory_structure.md

File metadata and controls

File Contents Overview

app/

main.py

pages/chat.py

pages/document_upload.py

pages/search.py

components/chatbot.py

components/document_viewer.py

backend/

database/mongodb_client.py

database/models.py

document_processing/pdf_processor.py

document_processing/jsonl_processor.py

document_processing/batch_processor.py

search/semantic_search.py

search/filtering.py

utils/text_splitter.py

utils/metadata_extractor.py

ai_models/response_generator.py

ai_models/function_caller.py

ai_models/model_loader.py

config/

app_config.py

logging_config.py

tests/

test_document_processing.py

test_search.py

test_database.py

requirements.txt

README.md