Thank you for your interest in contributing to DocuQuery. This project is designed as a production-oriented RAG system, and we welcome contributions that improve its performance, reliability, and usability.
Create your own fork of the repository and clone it locally.
Use a clear, descriptive branch name that starts with one of these prefixes:
feature/
fix/
improvement/
Examples:
- feature/add-hybrid-search
- fix/retrieval-threshold-bug
- improvement/api-response-format
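The naming convention above can be checked mechanically. The following is a minimal sketch, not part of DocuQuery itself: the function name `is_valid_branch_name` is hypothetical, and the assumption that the description is lowercase words joined by hyphens is inferred from the examples only.

```python
import re

# Prefixes taken from the branch naming guidance above.
ALLOWED_PREFIXES = ("feature", "fix", "improvement")

# Assumption: the description after the slash is lowercase alphanumeric
# words joined by single hyphens, as in the listed examples. The project
# may accept other styles.
_PREFIX_GROUP = "|".join(ALLOWED_PREFIXES)
BRANCH_PATTERN = re.compile(rf"(?:{_PREFIX_GROUP})/[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_branch_name(name: str) -> bool:
    """Return True if name looks like '<prefix>/<hyphenated-description>'."""
    return BRANCH_PATTERN.fullmatch(name) is not None
```

With this sketch, `is_valid_branch_name("feature/add-hybrid-search")` returns `True`, while `is_valid_branch_name("hotfix/typo")` returns `False` because `hotfix` is not one of the listed prefixes.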
When making changes:
- Keep changes focused and minimal
- Follow existing project structure and naming conventions
- Add comments where necessary for clarity
All contributions should include appropriate test coverage:
- Unit tests for core logic
- Integration tests for pipelines or APIs
- Ensure all existing tests continue to pass
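As an illustration of a unit test for core logic, here is a small pytest-style sketch. The helper `filter_by_threshold` is hypothetical and defined inline so the example is self-contained; it is not an actual DocuQuery function, and real tests should target the project's own modules.

```python
from typing import List, Tuple

ScoredChunk = Tuple[str, float]


# Hypothetical helper for illustration only; not part of the DocuQuery codebase.
def filter_by_threshold(
    scored_chunks: List[ScoredChunk], threshold: float
) -> List[ScoredChunk]:
    """Keep (chunk, score) pairs whose score is at least the threshold."""
    return [(chunk, score) for chunk, score in scored_chunks if score >= threshold]


def test_keeps_scores_at_or_above_threshold():
    chunks = [("a", 0.9), ("b", 0.5), ("c", 0.2)]
    # The boundary value (score == threshold) is kept.
    assert filter_by_threshold(chunks, 0.5) == [("a", 0.9), ("b", 0.5)]


def test_empty_input_returns_empty_list():
    assert filter_by_threshold([], 0.5) == []
```

Saved in a file whose name starts with `test_`, these functions are collected and run by `pytest`. Covering the boundary case and the empty input keeps the test focused on behavior that is easy to break.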
Validation Checklist:
Before submitting your changes, run the following commands and confirm that each one succeeds (the final uvicorn command should start the API server without errors):
pytest
python -m app.run_pipeline
python -m eval.run_eval
uvicorn app.main:app --reload
When opening a PR, include:
- Clear description of the change
- Motivation for the change
- Any relevant screenshots or logs (if applicable)
PR guidelines:
- Keep PRs small and focused
- Avoid unrelated changes in a single PR
- Ensure code is clean and readable
- Respond to review comments promptly
All pull requests will be reviewed before merging.
Code style:
- Follow consistent Python formatting
- Use meaningful variable and function names
- Keep functions modular and reusable
- Avoid unnecessary complexity
We actively welcome improvements in:
- Retrieval quality (ranking, filtering)
- Evaluation metrics and datasets
- Performance optimizations
- API improvements
- Documentation clarity
- Test coverage
If you encounter a bug or have a feature request:
- Use the GitHub Issues tab
- Provide clear steps to reproduce (for bugs)
- Include logs or error messages if available
DocuQuery is built with a focus on:
- Reliability over shortcuts
- Clarity over cleverness
- Measurable improvements over assumptions
Contributions aligning with these principles are highly encouraged.
Aditya Bisoyi
Software Engineer focused on scalable systems and applied machine learning