Commit cb1dc7c

feat: Code Quality Improvements & Documentation Overhaul
* feat: add comprehensive error handling, logging, and type hints
  - Add structured logging with file and console output across all modules
  - Implement robust error handling with graceful degradation
  - Add comprehensive type hints to all functions and methods
  - Add concise docstrings for all public functions
  - Improve validation and input checking throughout codebase
  - Enhance OpenAI API error handling with better user messages
* feat: enhance Streamlit UI with better UX and error handling
  - Add improved status indicators and connection validation
  - Implement conversation context retention within sessions
  - Add loading states and user-friendly error messages
  - Include sidebar controls with clear conversation functionality
  - Add example queries and helpful tips for new users
  - Improve page layout and visual feedback
* feat: add code quality tools and pre-commit configuration
  - Add pre-commit hooks with Black, Flake8, isort, and MyPy
  - Configure pyproject.toml for consistent code formatting
  - Set up automated code quality checks on commits
  - Include trailing whitespace and file formatting hooks
  - Configure type checking and import sorting standards
* ci: add GitHub Actions workflow for automated code quality checks
  - Create CI pipeline testing Python 3.8-3.12 compatibility
  - Add automated Black, isort, Flake8, and MyPy checks
  - Include import structure validation for all modules
  - Set up continuous integration for pull requests and pushes
  - Enable early detection of code quality issues
* docs: modernize README with comprehensive documentation
  - Convert from RST to Markdown format with modern styling
  - Add badges for Python version, license, and build status
  - Include architecture diagram and visual project overview
  - Add detailed quick start guide and usage examples
  - Provide comprehensive troubleshooting section
  - Include contribution guidelines and development setup
* docs: add comprehensive development guide
  - Create detailed developer setup instructions
  - Add code quality standards and guidelines
  - Include testing and debugging tips
  - Provide architecture overview and project structure
  - Document common development issues and solutions
* fix: apply code formatting and linting fixes
  - Apply Black code formatting to all Python files
  - Fix import sorting with isort
  - Resolve all Flake8 linting issues
  - Fix MyPy type checking errors
  - Remove unused imports and variables
  - Fix line length violations and formatting inconsistencies
  - Add proper type annotations for global variables
  - Add test_env to .gitignore
* fix: resolve Python 3.8 compatibility issues
  - Create separate requirements-py38.txt for Python 3.8 compatibility
  - Use numpy>=1.21.0,<1.25.0 for Python 3.8 (numpy 1.26.4 requires Python 3.9+)
  - Use pandas>=1.5.0,<2.1.0 for Python 3.8 compatibility
  - Update Python 3.8 workflow to use Python 3.8 compatible requirements
  - Update cache key to reference correct requirements file
* simplify: streamline CI/CD pipeline to Python 3.11 only
  - Remove Python 3.8 compatibility workflow and requirements
  - Simplify code quality workflow to use single Python 3.11 version
  - Update pyproject.toml configurations to target Python 3.11
  - Reduce CI complexity while maintaining code quality checks
* refactor: simplify CI/CD - remove code quality checks
  - Remove Black, isort, Flake8, and MyPy checks from CI/CD
  - Code quality should be enforced via pre-commit hooks locally
  - Rename workflow from 'Code Quality' to 'CI Tests'
  - Keep only dependency installation and import structure tests
  - Prevents PR failures due to formatting issues
* feat: apply PR review fixes
  - Centralize logging in entrypoints; Streamlit force logging; gate file logs via env
  - Embeddings: chunk + mean pool, retry/backoff, timeouts
  - Similarity: switch to cosine (L2-normalize + IndexFlatIP); show proper score
  - Metadata: truncate stored content to keep index lean
  - Config: default WATCHED_DIR to cwd
  - Tests: remove OpenAI dependency; dummy vector test
  - CI: add lint/mypy/pytest job; README 3.11+
  - Docs: add AGENTS.md contributor guide
* style: fix import order per isort
* ci: align flake8 flags with project (88 cols, ignore E203,W503)
* ci: ensure coderag is importable during pytest (set PYTHONPATH)
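
The review fixes mention mean-pooling chunk embeddings and scoring by cosine similarity through L2 normalization plus an inner-product index. The sketch below illustrates only the arithmetic behind that idea in plain Python; it is not the code from this commit. All function names are hypothetical, and a real index such as FAISS `IndexFlatIP` would compute the inner products instead of `inner_product`.

```python
import math
from typing import List, Sequence


def mean_pool(chunk_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average per-chunk embeddings into one vector of the same dimension."""
    if not chunk_vectors:
        raise ValueError("need at least one chunk vector")
    dim = len(chunk_vectors[0])
    if any(len(v) != dim for v in chunk_vectors):
        raise ValueError("all chunk vectors must share one dimension")
    count = len(chunk_vectors)
    return [sum(v[i] for v in chunk_vectors) / count for i in range(dim)]


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product, the score an inner-product index would return."""
    return sum(x * y for x, y in zip(a, b))


def cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity as the inner product of the L2-normalized vectors."""
    return inner_product(l2_normalize(a), l2_normalize(b))
```

Because both vectors are normalized first, the score depends only on direction: `cosine_score([3, 4], [6, 8])` is 1 up to floating-point rounding, even though the second vector is twice as long.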
1 parent 83977e3 commit cb1dc7c

19 files changed: +1368 -246 lines changed

.github/workflows/ci-tests.yml

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
name: CI Tests

on:
  push:
    branches: [ main, master, develop ]
  pull_request:
    branches: [ main, master, develop ]

jobs:
  test-imports:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Cache pip dependencies
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Test Import Structure
        run: |
          python -c "import coderag.config; print('✓ Config import successful')"
          python -c "import coderag.embeddings; print('✓ Embeddings import successful')"
          python -c "import coderag.index; print('✓ Index import successful')"
          python -c "import coderag.search; print('✓ Search import successful')"
          python -c "import coderag.monitor; print('✓ Monitor import successful')"
        env:
          OPENAI_API_KEY: dummy-key-for-testing

  quality-and-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install black flake8 isort mypy pytest
      - name: Lint and type-check
        run: |
          black --check .
          isort --check-only .
          flake8 . --max-line-length=88 --ignore=E203,W503
          mypy .
      - name: Run tests
        env:
          PYTHONPATH: ${{ github.workspace }}
        run: pytest -q
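
The `Test Import Structure` step runs one `python -c` per module, so the job stops at the first failing import. As a rough illustration of an alternative, the sketch below collects every import failure in one pass. The helper `check_imports` is hypothetical and not part of this commit; only the module names in the usage comment come from the workflow.

```python
import importlib
from typing import Dict, Iterable, Optional


def check_imports(module_names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Try to import each module; map its name to None or an error summary."""
    results: Dict[str, Optional[str]] = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # import-time code can raise more than ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


# Possible use in CI (module names taken from the workflow):
# failures = {
#     name: error
#     for name, error in check_imports(
#         ["coderag.config", "coderag.embeddings", "coderag.index",
#          "coderag.search", "coderag.monitor"]
#     ).items()
#     if error is not None
# }
# raise SystemExit(1 if failures else 0)
```

Reporting all failures at once can shorten the fix-and-rerun loop, at the cost of a slightly longer script than the one-liners.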

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -27,3 +27,5 @@ node_modules/
 *.tmp
 plan.md
 metadata.npy
+test_env/
+*.npy

.pre-commit-config.yaml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
repos:
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black
        language_version: python3
        args: ['--line-length=88']

  - repo: https://github.com/pycqa/flake8
    rev: 7.0.0
    hooks:
      - id: flake8
        args: ['--max-line-length=88', '--ignore=E203,W503']

  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort
        args: ["--profile", "black"]

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.8.0
    hooks:
      - id: mypy
        additional_dependencies: [types-all]
        args: [--ignore-missing-imports, --no-strict-optional]

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
      - id: check-merge-conflict
      - id: debug-statements

AGENTS.md

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
# Repository Guidelines

## Project Structure & Module Organization
- `coderag/`: Core library (`config.py`, `embeddings.py`, `index.py`, `search.py`, `monitor.py`).
- `app.py`: Streamlit UI. `main.py`: backend/indexer. `prompt_flow.py`: RAG orchestration.
- `scripts/`: Utilities (e.g., `initialize_index.py`, `run_monitor.py`).
- `tests/`: Minimal checks (e.g., `test_faiss.py`).
- `example.env` → copy to `.env` for local secrets; CI lives in `.github/`.

## Build, Test, and Development Commands
- Create env: `python -m venv venv && source venv/bin/activate`.
- Install deps: `pip install -r requirements.txt`.
- Run backend: `python main.py` (indexes and watches `WATCHED_DIR`).
- Run UI: `streamlit run app.py`.
- Quick test: `python tests/test_faiss.py` (FAISS round‑trip sanity check).
- Quality suite: `pre-commit run --all-files` (black, isort, flake8, mypy, basics).

## Coding Style & Naming Conventions
- Formatting: Black (88 cols), isort profile "black"; run `black . && isort .`.
- Linting: flake8 with `--ignore=E203,W503` to match Black.
- Typing: mypy (py311 target; ignore missing imports OK). Prefer typed signatures and docstrings.
- Indentation: 4 spaces. Names: `snake_case` for files/functions, `PascalCase` for classes, constants `UPPER_SNAKE`.
- Imports: first‑party module is `coderag` (see `pyproject.toml`).

## Testing Guidelines
- Place tests in `tests/` as `test_*.py`. Keep unit tests deterministic; mock OpenAI calls where possible.
- Run directly (`python tests/test_faiss.py`) or with pytest if available (`pytest -q`).
- Ensure `.env` or env vars provide `OPENAI_API_KEY` for integration tests; avoid hitting rate limits in CI.
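
One way to follow the "mock OpenAI calls" guideline is to pass the embedding call in as a parameter and replace it with a `unittest.mock.Mock` in tests. The sketch below assumes a hypothetical wrapper `embed_text`; it is not a function from this repository, and the real code may call the OpenAI client differently.

```python
from typing import Callable, List
from unittest.mock import Mock


def embed_text(text: str, embed_fn: Callable[[str], List[float]]) -> List[float]:
    """Hypothetical wrapper: validate input, then delegate to the injected callable."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return embed_fn(text)


def test_embed_text_uses_injected_callable() -> None:
    # The Mock stands in for the network call, so the test is deterministic
    # and needs no OPENAI_API_KEY.
    fake_embed = Mock(return_value=[0.1, 0.2, 0.3])
    assert embed_text("def f(): pass", fake_embed) == [0.1, 0.2, 0.3]
    fake_embed.assert_called_once_with("def f(): pass")
```

Injecting the callable keeps the unit test free of patching by import path; `unittest.mock.patch` is an alternative when the call site cannot be changed.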

## Commit & Pull Request Guidelines
- Use Conventional Commits seen in history: `feat:`, `fix:`, `docs:`, `ci:`, `refactor:`, `simplify:`.
- Before pushing: `pre-commit run --all-files` and update docs when behavior changes.
- PRs: clear description, linked issues, steps to validate; include screenshots/GIFs for UI changes; note config changes (`.env`).

## Security & Configuration Tips
- Never commit secrets. Start with `cp example.env .env`; set `OPENAI_API_KEY`, `WATCHED_DIR`, `FAISS_INDEX_FILE`.
- Avoid logging sensitive data. Regenerate the FAISS index if dimensions or models change (`python scripts/initialize_index.py`).

DEVELOPMENT.md

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
# 🛠️ Development Guide

## Setting Up Development Environment

### 1. Clone and Setup

```bash
git clone https://github.com/your-username/CodeRAG.git
cd CodeRAG
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

### 2. Configure Pre-commit Hooks

```bash
pip install pre-commit
pre-commit install
```

This will run code quality checks on every commit:
- **Black**: Code formatting
- **isort**: Import sorting
- **Flake8**: Linting and style checks
- **MyPy**: Type checking
- **Basic hooks**: Trailing whitespace, file endings, etc.

### 3. Environment Variables

Copy `example.env` to `.env` and configure:

```bash
cp example.env .env
```

Required variables:
```env
OPENAI_API_KEY=your_key_here  # Required for embeddings and chat
WATCHED_DIR=/path/to/code     # Directory to index (default: current dir)
```
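
As a rough illustration of how these two variables might be read, the sketch below fails fast when `OPENAI_API_KEY` is missing and falls back to the current directory for `WATCHED_DIR`. The `Settings` class and `load_settings` function are hypothetical and are not the project's `coderag/config.py`; the sketch also assumes the variables are already in the process environment and does not parse `.env` itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two variables described above."""

    openai_api_key: str
    watched_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    source = os.environ if env is None else env
    api_key = source.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; copy example.env to .env")
    # An unset or empty WATCHED_DIR falls back to the current working directory.
    watched_dir = source.get("WATCHED_DIR") or os.getcwd()
    return Settings(openai_api_key=api_key, watched_dir=os.path.abspath(watched_dir))
```

Accepting a mapping instead of reading `os.environ` directly makes the function easy to exercise in tests without touching the real environment.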

## Code Quality Standards

### Type Hints
All functions should have type hints:

```python
from typing import Optional
import numpy as np

def process_file(filepath: str, content: str) -> Optional[np.ndarray]:
    """Process a file and return embeddings."""
    ...
```

### Error Handling
Use structured logging and proper exception handling:

```python
import logging

logger = logging.getLogger(__name__)


def run_operation():
    # `risky_operation` and `SpecificError` are placeholders for your own code.
    try:
        return risky_operation()
    except SpecificError as e:
        logger.error(f"Operation failed: {e}")
        return None
```

### Documentation
Use concise docstrings for public functions:

```python
from typing import Any, Dict, List


def search_code(query: str, k: int = 5) -> List[Dict[str, Any]]:
    """Search the FAISS index using a text query.

    Args:
        query: The search query text
        k: Number of results to return

    Returns:
        List of search results with metadata
    """
```

## Testing Your Changes

### Manual Testing
```bash
# Test backend indexing
python main.py

# Test Streamlit UI (separate terminal)
streamlit run app.py
```

### Code Quality Checks
```bash
# Format code
black .
isort .

# Check linting
flake8 .

# Type checking
mypy .

# Run all pre-commit checks
pre-commit run --all-files
```

## Adding New Features

1. **Create feature branch**: `git checkout -b feature/new-feature`
2. **Add logging**: Use the logger for all operations
3. **Add type hints**: Follow existing patterns
4. **Handle errors**: Graceful degradation and user-friendly messages
5. **Update tests**: Add tests for new functionality
6. **Update docs**: Update README if needed

## Architecture Guidelines

### Keep It Simple
- Maintain the single-responsibility principle
- Avoid unnecessary abstractions
- Focus on the core RAG functionality

### Error Handling Strategy
- Log errors with context
- Return None/empty lists for failures
- Show user-friendly messages in UI
- Don't crash the application

### Performance Considerations
- Limit search results (default: 5)
- Truncate long content for context
- Cache embeddings when possible
- Monitor memory usage with large codebases
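
Two of the points above, truncating long content and caching embeddings, can be sketched in a few lines. This is only an illustration: `MAX_CONTENT_CHARS`, `truncate`, and `EmbeddingCache` are hypothetical names, the 2000-character limit is an arbitrary placeholder, and the cache lives in memory for the lifetime of the process.

```python
import hashlib
from typing import Callable, Dict, List

MAX_CONTENT_CHARS = 2000  # placeholder limit, not a value from the project


def truncate(content: str, limit: int = MAX_CONTENT_CHARS) -> str:
    """Keep at most `limit` characters of the content."""
    return content if len(content) <= limit else content[:limit]


class EmbeddingCache:
    """In-memory cache keyed by a SHA-256 hash of the content."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}

    def get(self, content: str) -> List[float]:
        key = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if key not in self._store:
            # Only unseen content reaches the (possibly slow or paid) embedding call.
            self._store[key] = self._embed_fn(content)
        return self._store[key]
```

Hashing the content rather than the file path means an unchanged file that is renamed or re-saved does not trigger a new embedding request.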

## Debugging Tips

### Enable Debug Logging
```python
import logging

logging.basicConfig(level=logging.DEBUG)
```
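
The commit message describes centralizing logging in the entrypoints, forcing the configuration under Streamlit, and gating file logs through the environment. A minimal sketch of that idea follows. The variable names `LOG_LEVEL` and `LOG_FILE` and the function `configure_logging` are assumptions for illustration and may not match what the project uses.

```python
import logging
import os
import sys
from typing import List, Mapping, Optional


def configure_logging(env: Optional[Mapping[str, str]] = None) -> List[logging.Handler]:
    """Configure the root logger once, from an entrypoint such as main.py or app.py."""
    source = os.environ if env is None else env
    # LOG_LEVEL and LOG_FILE are hypothetical variable names.
    level = getattr(logging, source.get("LOG_LEVEL", "INFO").upper(), logging.INFO)
    if not isinstance(level, int):
        level = logging.INFO
    handlers: List[logging.Handler] = [logging.StreamHandler(sys.stderr)]
    log_file = source.get("LOG_FILE")
    if log_file:
        # File logging is opt-in: it is enabled only when LOG_FILE is set.
        handlers.append(logging.FileHandler(log_file, encoding="utf-8"))
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        handlers=handlers,
        force=True,  # replace handlers installed earlier, e.g. by a host framework
    )
    return handlers
```

Library modules would then only call `logging.getLogger(__name__)` and leave handler setup to the entrypoint, so `LOG_LEVEL=DEBUG python main.py` would enable debug output under these assumptions.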

### Check Index Status
```python
from coderag.index import inspect_metadata
inspect_metadata(5)  # Show first 5 entries
```

### Test Embeddings
```python
from coderag.embeddings import generate_embeddings
result = generate_embeddings("test code")
print(f"Shape: {result.shape if result is not None else 'None'}")
```

## Common Development Issues

**Import Errors**
- Ensure you're in the virtual environment
- Check PYTHONPATH includes project root
- Verify all dependencies are installed

**OpenAI API Issues**
- Check API key validity
- Monitor rate limits and usage
- Test with a simple embedding request

**FAISS Index Corruption**
- Delete existing index files and rebuild
- Check file permissions
- Ensure consistent embedding dimensions
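
A FAISS index is created with a fixed dimension (exposed as `index.d`), so vectors from a different embedding model or setting will not fit an existing index. A small guard like the one below, run before adding vectors, can surface the mismatch with a clearer message. The function `check_dimensions` is a hypothetical helper, not part of the project.

```python
from typing import Sequence


def check_dimensions(vectors: Sequence[Sequence[float]], expected_dim: int) -> None:
    """Raise ValueError if any vector's length differs from expected_dim."""
    for position, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {position} has dimension {len(vector)}, "
                f"expected {expected_dim}; rebuild the index if the model changed"
            )
```

With a real index, `expected_dim` would typically be `index.d`; on a mismatch, rebuilding the index as described above is usually the simplest recovery.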

## Project Structure

```
CodeRAG/
├── coderag/              # Core library
│   ├── __init__.py
│   ├── config.py         # Configuration management
│   ├── embeddings.py     # OpenAI integration
│   ├── index.py          # FAISS operations
│   ├── search.py         # Search functionality
│   └── monitor.py        # File monitoring
├── scripts/              # Utility scripts
├── tests/                # Test files
├── .github/              # GitHub workflows
├── main.py               # Backend service
├── app.py                # Streamlit frontend
├── prompt_flow.py        # RAG orchestration
└── requirements.txt      # Dependencies
```
