Description
To improve the onboarding experience for new contributors, I propose adding a brief high-level architecture overview and clarifying environment-related inconsistencies in the README.
Proposed Changes
1. High-Level Architecture Overview
The current README focuses heavily on setup steps but does not provide a clear “big picture” of how the system components work together. I propose adding a short Architecture Overview section near the top of the README:
Architecture Overview
The KnowledgeSpace AI Agent is a Retrieval-Augmented Generation (RAG) system composed of:
- React Frontend
  - Provides a chat-based interface for dataset discovery.
- FastAPI Backend
  - Coordinates LLM reasoning (Gemini), keyword-based search (Elasticsearch), and semantic search (Vertex AI Matching Engine).
- Data Pipeline
  - Scrapes and normalizes neuroscience metadata, stores structured records in BigQuery, and indexes embeddings in Vertex AI.
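
To make the backend's coordinating role concrete, the README could also carry a small illustrative sketch like the one below. It is not taken from the codebase: `Hit`, `merge_hits`, `build_prompt`, `answer`, and the `fake_*` stubs are hypothetical names, the stubs stand in for Elasticsearch, Vertex AI, and Gemini, and the merge step assumes the two retrievers return scores on a comparable scale, which may not hold in practice.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Hit:
    """One retrieved dataset record (hypothetical shape, for illustration)."""
    dataset_id: str
    title: str
    score: float


Retriever = Callable[[str], List[Hit]]


def merge_hits(keyword_hits: List[Hit], semantic_hits: List[Hit], limit: int = 5) -> List[Hit]:
    """Deduplicate by dataset_id, keep the best score per dataset, rank descending."""
    best = {}
    for hit in [*keyword_hits, *semantic_hits]:
        current = best.get(hit.dataset_id)
        if current is None or hit.score > current.score:
            best[hit.dataset_id] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)[:limit]


def build_prompt(question: str, hits: List[Hit]) -> str:
    """Put the retrieved records into the prompt (the 'augmented' part of RAG)."""
    context = "\n".join(f"- {h.title} ({h.dataset_id})" for h in hits)
    return f"Question: {question}\nCandidate datasets:\n{context}"


def answer(question: str, keyword_search: Retriever, semantic_search: Retriever,
           generate: Callable[[str], str]) -> str:
    """Retrieve from both sources, merge, then hand the prompt to the LLM."""
    hits = merge_hits(keyword_search(question), semantic_search(question))
    return generate(build_prompt(question, hits))


# Stubs standing in for the real services (assumptions, not real clients).
def fake_keyword(query: str) -> List[Hit]:
    return [Hit("ds-1", "Mouse cortex recordings", 0.7),
            Hit("ds-2", "Rat hippocampus imaging", 0.4)]


def fake_semantic(query: str) -> List[Hit]:
    return [Hit("ds-2", "Rat hippocampus imaging", 0.9),
            Hit("ds-3", "Human EEG sleep study", 0.5)]


def fake_generate(prompt: str) -> str:
    return f"[stub LLM] {prompt}"


if __name__ == "__main__":
    print(answer("hippocampus imaging data", fake_keyword, fake_semantic, fake_generate))
```

The point of the sketch is only the order of operations: two retrievers run on the same query, their results are merged, and the merged context is passed to the model. How the actual backend ranks or combines results would need to be confirmed against the code.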
2. Clarifying Port Configuration
The README currently mentions different ports for local development and Docker-based deployment without explanation. I propose adding a short note in the Running the Application section to avoid confusion:
⚠️ Note on Port Configuration
- Local development: React development server runs on port 5000
- Docker / Nginx deployment: Containerized frontend is exposed on port 3000
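
If it helps, the note could be paired with a tiny lookup like the following, which contributors could adapt for smoke tests or scripts. The mode names `"local"` and `"docker"`, the `FRONTEND_PORTS` mapping, and the `frontend_url` helper are hypothetical and do not exist in the repository; only the two port numbers come from the note above.

```python
# Hypothetical mapping of run mode to frontend port (ports as stated in the note).
FRONTEND_PORTS = {
    "local": 5000,   # React development server
    "docker": 3000,  # containerized frontend behind Nginx
}


def frontend_url(mode: str, host: str = "localhost") -> str:
    """Return the frontend base URL for a given run mode."""
    try:
        port = FRONTEND_PORTS[mode]
    except KeyError:
        known = ", ".join(sorted(FRONTEND_PORTS))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {known}") from None
    return f"http://{host}:{port}"


if __name__ == "__main__":
    for mode in FRONTEND_PORTS:
        print(mode, frontend_url(mode))
```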
3. Architecture Flowchart
To further improve clarity, I propose adding a simple, high-level flowchart to visually explain the request and data flow across the system. This helps new contributors quickly understand how user queries are processed end-to-end.