Code Expert – Free AI-Powered Codebase Q&A

Code Expert is a free, open-source AI assistant that lets developers input any public GitHub repository URL and instantly ask questions about its code structure, dependencies, and functionality. Powered by advanced RAG (Retrieval Augmented Generation) and Google's Gemini 2.5 Pro, it provides accurate, context-grounded answers to help you explore, understand, and onboard to complex codebases with ease.

✨ Key Features & Benefits

Real-Time Chat: Interact directly with your indexed code and get answers in seconds.
100% Free & Open Source: Free to use and modify. Self-hosting or local setup requires your own API keys for Supabase and Google Gemini (see Environment Variables).
Dual RAG Engines: Compare "Base RAG" (pure semantic similarity) and "Filtered RAG" (semantic similarity with keyword filtering) for optimal answers tailored to your needs.
Supports Multiple Languages: Automatically chunks and indexes Python, JavaScript, TypeScript (including TSX), Java, C++, and Markdown files (see Known Limitations & Bugs for the exact extension list).
Powered by Gemini 2.5 Pro: Uses Google's Gemini 2.5 Pro model for code understanding and response generation.
Supabase Integration: Utilizes Supabase for efficient vector storage and retrieval of code chunks.
Netlify Functions: Backend logic is deployed as serverless functions on Netlify, ensuring scalability and ease of deployment.
Intuitive UI: A clean and responsive user interface built with React and Tailwind CSS.

🚀 Getting Started

Follow these steps to set up and run Code Expert locally or deploy it.

Prerequisites

Before you begin, make sure you have the following:

Node.js: Version 18 or higher.
npm: Version 8 or higher (comes with Node.js).
Git: For cloning the repository.
A Supabase Project: You'll need a Supabase project URL and a service_role key.
A Google Cloud Project: With the Generative Language API enabled and an API key.

Installation

Clone the repository:

```
git clone https://github.com/AryamanGupta001/Code-Expert.git
cd Code-Expert
```

Install dependencies:
```
npm install
```

Environment Variables

Code Expert relies on several environment variables for its functionality.

Create a .env file: Copy the .env.example file to .env in the root of your project:
```
cp .env.example .env
```
Populate .env: Open the newly created .env file and fill in your credentials:
```
# .env
# Optional: for cloning private repositories
GITHUB_TOKEN=YOUR_GITHUB_TOKEN_HERE
SUPABASE_URL=YOUR_SUPABASE_URL_HERE
SUPABASE_SERVICE_KEY=YOUR_SUPABASE_SERVICE_KEY_HERE
GEMINI_API_KEY=YOUR_GEMINI_API_KEY_HERE
```
- GITHUB_TOKEN: (Optional) A GitHub Personal Access Token with repo scope if you plan to index private repositories.
- SUPABASE_URL: Your Supabase project URL, found in your Supabase project settings.
- SUPABASE_SERVICE_KEY: Your Supabase service_role key, found under Project Settings > API Keys in your Supabase dashboard. Keep this key secure and do not expose it in client-side code.
- GEMINI_API_KEY: Your Google AI Studio API Key with access to the Gemini API. Ensure the Generative Language API is enabled in your Google Cloud project.
Netlify Environment Variables (for deployment): If deploying to Netlify, you must also configure these environment variables in your Netlify dashboard: Site Settings > Build & deploy > Environment. Add each key exactly as above.
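
A small startup check can catch a missing or placeholder value before a function tries to reach Supabase or Gemini. The following is a minimal sketch, not code from this repository: the variable names come from the .env example above, while `missingEnvVars` and `checkEnv` are hypothetical helpers, and treating values that start with `YOUR_` as unset is an assumption based on the placeholders shown.

```typescript
// Hypothetical helpers for illustration; not part of Code Expert.
// GITHUB_TOKEN is left out because the README marks it as optional.
const REQUIRED_VARS = ["SUPABASE_URL", "SUPABASE_SERVICE_KEY", "GEMINI_API_KEY"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset, blank,
// or still hold a YOUR_..._HERE placeholder.
function missingEnvVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name]?.trim();
    return !value || value.startsWith("YOUR_");
  });
}

// Throws with a readable message when configuration is incomplete.
function checkEnv(env: Env): void {
  const missing = missingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Netlify Function this could be called as `checkEnv(process.env)` before any client is created.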

Supabase Database Setup

Code Expert uses a PostgreSQL database with the pgvector extension for storing and querying code embeddings.

Enable pgvector extension: In your Supabase project, navigate to Database > Extensions and enable pgvector.

Create code_chunks table and match_chunks_by_embedding function: Go to your Supabase SQL Editor and run the following SQL commands:

```
create extension if not exists vector;

create table if not exists code_chunks (
  id uuid primary key default gen_random_uuid(),
  repo_id text not null,
  file_path text not null,
  content text not null,
  embedding vector(768), -- Default for Xenova/microsoft-codebert-base
  metadata jsonb,
  created_at timestamp with time zone default now()
);

create or replace function match_chunks_by_embedding(
  query_embedding vector(768), -- Matches the embedding model dimension
  repo_filter text,
  k int
) returns table (
  id uuid,
  repo_id text,
  file_path text,
  content text,
  embedding vector(768), -- Matches the embedding model dimension
  metadata jsonb,
  created_at timestamp with time zone,
  distance float
) as $$
begin
  return query
  -- Qualify columns with the table alias: the output column names above are
  -- also PL/pgSQL variables, so bare names such as repo_id would be ambiguous.
  select c.*, (c.embedding <=> query_embedding) as distance
  from code_chunks c
  where c.repo_id = repo_filter
  order by c.embedding <=> query_embedding
  limit k;
end;
$$ language plpgsql;
```

Note: The embedding column type and query_embedding parameter in the match_chunks_by_embedding function are set to vector(768) to match the Xenova/microsoft-codebert-base model used in this project. If you change the embedding model, ensure you update these dimensions accordingly.
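
pgvector's `<=>` operator returns cosine distance, that is, 1 minus the cosine similarity of the two vectors. To make the ordering in match_chunks_by_embedding concrete, here is an in-memory TypeScript sketch of the same filter, sort, and limit steps. It is only an illustration under that reading: the `Chunk` shape, `cosineDistance`, and `matchChunks` are hypothetical names, and the real query runs inside Postgres.

```typescript
interface Chunk {
  repo_id: string;
  file_path: string;
  embedding: number[];
}

// Cosine distance: 1 - (a . b) / (|a| * |b|). Smaller means more similar.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// In-memory analogue of the SQL function: keep one repo's chunks,
// order them by distance to the query, and return the k closest.
function matchChunks(chunks: Chunk[], query: number[], repoFilter: string, k: number) {
  return chunks
    .filter((c) => c.repo_id === repoFilter)
    .map((c) => ({ ...c, distance: cosineDistance(c.embedding, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}
```

The dimension check mirrors the note above: a query vector whose length differs from the stored embeddings cannot be compared.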

🏃 Usage

Local Development

To run the application locally, you'll use Netlify CLI to serve both the frontend and the Netlify Functions.

Install Netlify CLI (if you haven't already):
```
npm install -g netlify-cli
```
Start the development server:
```
netlify dev
```
This will typically start the frontend at http://localhost:8888 and expose your Netlify Functions at http://localhost:8888/.netlify/functions/<functionName>.

Processing a GitHub Repository

Open the application in your browser (e.g., http://localhost:8888).
Navigate to the "Live Demo" section.
Paste a public GitHub repository URL (e.g., https://github.com/facebook/react) into the input field.
Click the "Process Repo" button.
The application will clone the repository, chunk its code files, generate embeddings, and store them in your Supabase database. This process can take some time depending on the size of the repository. A success message will appear once indexing is complete.
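
This README does not say how files are split into chunks. One common baseline is a sliding window of lines with a small overlap, so that a function cut at a window boundary still appears whole in at least one chunk when it is short enough. The sketch below illustrates that baseline only; `chunkByLines`, the `CodeChunk` shape, and the default sizes are assumptions, not the project's actual chunker.

```typescript
interface CodeChunk {
  file_path: string;
  start_line: number; // 1-based, inclusive
  end_line: number;   // 1-based, inclusive
  content: string;
}

// Splits a file into windows of at most maxLines lines, where each window
// repeats the last `overlap` lines of the previous one.
function chunkByLines(filePath: string, source: string, maxLines = 40, overlap = 5): CodeChunk[] {
  if (maxLines <= 0 || overlap < 0 || overlap >= maxLines) {
    throw new Error("require maxLines > 0 and 0 <= overlap < maxLines");
  }
  const lines = source.split("\n");
  const chunks: CodeChunk[] = [];
  const step = maxLines - overlap;
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + maxLines, lines.length);
    chunks.push({
      file_path: filePath,
      start_line: start + 1,
      end_line: end,
      content: lines.slice(start, end).join("\n"),
    });
    if (end === lines.length) break; // the last window reached the end of the file
  }
  return chunks;
}
```

Keeping `start_line` and `end_line` alongside the text would let an answer point back to a location in the file; whether Code Expert stores such fields in its metadata column is not stated here.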

Chatting with Your Codebase

Once a repository has been successfully indexed:

The chat interface will automatically appear below the "Live Demo" section.
Type your question about the codebase into the input field (e.g., "What does the UserService do?").
Choose between "Base RAG" and "Filtered RAG" variants:
- Base RAG: Retrieves code chunks purely based on semantic similarity to your question.
- Filtered RAG: Applies additional keyword filtering to prioritize more specific and relevant chunks, often leading to more precise answers for detailed questions.
Click the "Send" button.
Code Expert will retrieve relevant code snippets, use them as context for Gemini 2.5 Pro, and provide a grounded answer. You can also view the metrics (context relevance, groundedness) and the source files used to generate the answer.
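
The difference between the two variants can be pictured as a post-processing step on the retrieved chunks. The sketch below is one plausible reading of "keyword filtering" and not the project's actual logic: it extracts identifier-like words from the question, keeps chunks that mention at least one of them, and falls back to the purely semantic order when nothing matches. The names `extractKeywords`, `filterByKeywords`, and the stopword list are all hypothetical.

```typescript
interface RetrievedChunk {
  file_path: string;
  content: string;
  distance: number; // smaller means semantically closer to the question
}

const STOPWORDS = new Set(["the", "what", "does", "how", "and", "for", "this", "that", "with", "are"]);

// Lowercased identifier-like tokens from the question, minus short words and stopwords.
function extractKeywords(question: string): string[] {
  const tokens: string[] = question.toLowerCase().match(/[a-z_][a-z0-9_]*/g) ?? [];
  return Array.from(new Set(tokens)).filter((t) => t.length > 2 && !STOPWORDS.has(t));
}

// Keeps chunks that mention a keyword, ordered by keyword hits and then by distance.
function filterByKeywords(chunks: RetrievedChunk[], question: string): RetrievedChunk[] {
  const keywords = extractKeywords(question);
  const scored = chunks.map((chunk) => {
    const text = chunk.content.toLowerCase();
    const hits = keywords.filter((k) => text.includes(k)).length;
    return { chunk, hits };
  });
  const matching = scored.filter((s) => s.hits > 0);
  // Fall back to the semantic ranking when no chunk mentions any keyword.
  if (matching.length === 0) return [...chunks].sort((a, b) => a.distance - b.distance);
  return matching
    .sort((a, b) => b.hits - a.hits || a.chunk.distance - b.chunk.distance)
    .map((s) => s.chunk);
}
```

With the example question "What does the UserService do?", the only keyword left after filtering is `userservice`, so a chunk containing `class UserService` would be kept even if another chunk sits closer in embedding space.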

⚙️ Configuration Options

All configuration is managed via environment variables as described in the Environment Variables section.

📚 API Documentation

Code Expert exposes two primary Netlify Functions as its backend API:

`POST /.netlify/functions/processRepo`

Description: Clones a specified GitHub repository, chunks its code files, generates embeddings, and stores them in the Supabase database.

Request Body:

```
{
  "githubUrl": "string" // The URL of the GitHub repository to process
}
```

Response:

```
{
  "status": "success",
  "repo_id": "string",    // A unique ID (SHA256 hash) for the processed repository
  "total_chunks": number  // The total number of code chunks processed
}
```

Error Response:

```
{
  "error": "string" // Description of the error
}
```
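
A typed client for this endpoint might look like the sketch below. The request and response fields follow the shapes documented above; `processRepo`, `FetchLike`, and `isApiError` are hypothetical names, and the fetch implementation is passed in as a parameter so the example does not depend on DOM or Node typings.

```typescript
interface ProcessRepoSuccess {
  status: "success";
  repo_id: string;
  total_chunks: number;
}

interface ApiError {
  error: string;
}

// Minimal structural type that the global fetch function satisfies.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ json(): Promise<unknown> }>;

// Narrows an unknown payload to the documented error shape.
function isApiError(value: unknown): value is ApiError {
  return typeof value === "object" && value !== null && typeof (value as ApiError).error === "string";
}

// Posts the repository URL and returns the success payload,
// or throws with the server's error description.
async function processRepo(
  fetchImpl: FetchLike,
  baseUrl: string,
  githubUrl: string
): Promise<ProcessRepoSuccess> {
  const response = await fetchImpl(`${baseUrl}/.netlify/functions/processRepo`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ githubUrl }),
  });
  const payload = await response.json();
  if (isApiError(payload)) throw new Error(payload.error);
  return payload as ProcessRepoSuccess;
}
```

Against a local `netlify dev` server, a call could be written as `processRepo(fetch, "http://localhost:8888", "https://github.com/facebook/react")`.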

`POST /.netlify/functions/chat`

Description: Answers a question about an already indexed repository using Retrieval Augmented Generation (RAG).

Request Body:

```
{
  "repo_id": "string",          // The unique ID of the indexed repository
  "question": "string",         // The question to ask about the codebase
  "variant": "base" | "filtered" // The RAG variant to use
}
```

Response:

```
{
  "answer": "string", // The AI-generated answer
  "metrics": {
    "context_relevance": number,     // How relevant the retrieved context was (0-1)
    "groundedness": number,          // How much of the answer is supported by the context (0-1)
    "num_chunks_retrieved": number   // Number of code chunks used for generation
  },
  "sources": [
    {
      "file_path": "string", // Path to the source file
      "distance": number     // Semantic distance from the question embedding
    }
    // ... more source objects
  ]
}
```

Error Response:

```
{
  "error": "string" // Description of the error
}
```
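
The chat request and response shapes above translate directly into TypeScript types. The sketch below adds two small helpers around them as an illustration: `buildChatRequest` validates input before it is sent, and `rankedSourceFiles` lists the cited files from closest to farthest. Both helper names and the validation rules are assumptions, not part of the documented API.

```typescript
type RagVariant = "base" | "filtered";

interface ChatRequest {
  repo_id: string;
  question: string;
  variant: RagVariant;
}

interface ChatResponse {
  answer: string;
  metrics: { context_relevance: number; groundedness: number; num_chunks_retrieved: number };
  sources: { file_path: string; distance: number }[];
}

function isRagVariant(value: string): value is RagVariant {
  return value === "base" || value === "filtered";
}

// Validates user input and returns a body matching the documented request shape.
function buildChatRequest(repoId: string, question: string, variant: string): ChatRequest {
  if (!repoId) throw new Error("repo_id is required");
  if (!question.trim()) throw new Error("question must not be empty");
  if (!isRagVariant(variant)) throw new Error(`unknown variant: ${variant}`);
  return { repo_id: repoId, question: question.trim(), variant };
}

// Source file paths ordered by ascending distance, with duplicates removed.
function rankedSourceFiles(response: ChatResponse): string[] {
  const sorted = [...response.sources].sort((a, b) => a.distance - b.distance);
  return sorted.map((s) => s.file_path).filter((p, i, all) => all.indexOf(p) === i);
}
```

The result of `buildChatRequest` can be serialized with `JSON.stringify` and posted to `/.netlify/functions/chat`.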

🤝 Contributing

We welcome contributions to Code Expert! If you'd like to contribute, please follow these steps:

Fork the repository.
Clone your forked repository: git clone https://github.com/<your-username>/Code-Expert.git
Create a new branch: git checkout -b feature/your-feature-name
Make your changes and ensure they adhere to the existing code style.
Commit your changes: git commit -m "feat: Add new feature"
Push to your branch: git push origin feature/your-feature-name
Open a Pull Request to the main branch of the original repository.

Please ensure your code compiles without errors and passes all checks.

⚠️ Known Limitations & Bugs

Processing Time: Indexing large repositories can take 60 seconds or more because of the cloning, chunking, and embedding steps.
Cold Start Latency: The embedding model (Xenova/microsoft-codebert-base) can incur a cold start delay on Netlify Functions, leading to initial requests being slower.
API Rate Limits: Heavy usage might hit rate limits on GitHub (for cloning) or Google Gemini API.
Public Repositories Only: By default, only public GitHub repositories can be processed. Support for private repositories requires a GITHUB_TOKEN with appropriate permissions.
File Type Support: Only specific code file extensions are processed (.py, .js, .java, .cpp, .ts, .tsx, .md).
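
The file type restriction amounts to an extension allow-list. A minimal sketch of such a check follows; the extension list is taken from the item above, while `isSupportedFile` is a hypothetical name and the case-insensitive comparison is an assumption.

```typescript
// Extensions listed under "File Type Support".
const SUPPORTED_EXTENSIONS = [".py", ".js", ".java", ".cpp", ".ts", ".tsx", ".md"];

// True when the path ends in one of the supported extensions.
function isSupportedFile(filePath: string): boolean {
  const name = filePath.split("/").pop() ?? "";
  const dot = name.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".gitignore"
  return SUPPORTED_EXTENSIONS.indexOf(name.slice(dot).toLowerCase()) !== -1;
}
```

Files that fail this check, for example `.rs` or `.go` sources, would be skipped during indexing, so questions about them cannot be grounded in retrieved context.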

🗺️ Roadmap

Here are some planned features and future improvements:

Enhanced Private Repo Support: More robust authentication for private repositories (e.g., OAuth).
Alternative LLM Integrations: Support for other generative AI models beyond Gemini.
Advanced Chunking Strategies: Implement AST-based or semantic chunking for more intelligent code segmentation.
User Authentication & History: Allow users to log in and persist their chat history and indexed repositories.
Web UI for Repo Management: A dedicated interface to view, manage, and delete indexed repositories.
Improved Metrics & Analytics: More detailed insights into RAG performance.
Streaming Responses: Implement server-sent events for real-time AI response generation.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Credits & Acknowledgments

Code Expert was developed by Aryaman Gupta.

Special thanks to:

Netlify: For providing the serverless functions and hosting platform.
Supabase: For the powerful PostgreSQL database and pgvector extension.
Google Gemini: For the advanced generative AI capabilities.
Hugging Face Transformers.js: For running the embedding model in JavaScript.
All contributors: Who help make this project better.

📧 Contact

For questions, feedback, or support, please open an issue on the GitHub repository or contact the maintainer:

Aryaman Gupta: LinkedIn
