Thank you for your interest in contributing to LLMKube! This project aims to make local LLM deployment as reliable and scalable as deploying microservices on Kubernetes.
- Code of Conduct
- How Can I Contribute?
- Development Setup
- Making Changes
- Pull Request Process
- Coding Standards
- Community
## Code of Conduct

We are committed to providing a welcoming and inspiring community for all. Please be respectful and constructive in all interactions.
Expected Behavior:
- Use welcoming and inclusive language
- Be respectful of differing viewpoints and experiences
- Gracefully accept constructive criticism
- Focus on what is best for the community
- Show empathy towards other community members
Unacceptable Behavior:
- Harassment, discriminatory language, or personal attacks
- Trolling, insulting/derogatory comments
- Public or private harassment
- Publishing others' private information without permission
## How Can I Contribute?

### Reporting Bugs

Before creating bug reports, please check existing issues to avoid duplicates.
Good bug reports include:
- Clear, descriptive title
- Exact steps to reproduce the problem
- Expected vs actual behavior
- LLMKube version (`llmkube version`)
- Kubernetes version (`kubectl version`)
- Environment details (GKE, EKS, minikube, etc.)
- Relevant logs from controller or inference pods
- Sample YAML manifests (if applicable)
Use the bug report template when creating issues.
### Suggesting Features

We track feature requests via GitHub Issues with the `enhancement` label.
Good feature requests include:
- Clear use case and problem statement
- Proposed solution (if you have one)
- Alternatives you've considered
- Impact on existing functionality
- Examples of similar features in other projects
### Contributing Code

Look for issues labeled:

- `good-first-issue` - Small, well-defined tasks
- `help-wanted` - Larger tasks where we need help
- `documentation` - Documentation improvements
High Priority (Phase 2+):
- Multi-platform CLI builds (GoReleaser workflow improvements)
- Multi-GPU single-node support (13B models on 2x GPUs)
- AMD GPU support (ROCm integration)
- Intel GPU support (oneAPI)
- Additional Grafana dashboards (inference metrics, cost tracking)
- Performance benchmarking on different GPU types
Medium Priority:
- Multi-node GPU sharding (70B models across multiple nodes)
- Edge deployment guides (K3s, ARM64, Jetson)
- Helm chart packaging
- Advanced SLO auto-remediation
- Cost optimization features
Completed (Phase 0-1):
- ✅ NVIDIA GPU support with CUDA
- ✅ Prometheus + Grafana + DCGM metrics
- ✅ GPU E2E test suite
- ✅ GPU dashboard templates
- ✅ CLI with GPU deployment support
## Development Setup

Required:
- Go 1.24+: Install from golang.org
- Docker 17.03+: For building images
- kubectl: Configured with a cluster (minikube, kind, GKE, etc.)
- Kubernetes cluster: v1.11.3+ (minikube, kind, or cloud)
Optional (for GPU development):
- GPU cluster: GKE with NVIDIA GPUs (see `terraform/gke/`)
- NVIDIA GPU Operator: Installed on cluster
- CUDA knowledge: For GPU inference work
```bash
# Fork the repository on GitHub first, then:
git clone git@github.com:YOUR_USERNAME/llmkube.git
cd llmkube

# Install dependencies
go mod download

# Generate CRD manifests
make manifests

# Build the operator binary
make build

# Build the CLI
make build-cli

# Run tests
make test
```

```bash
# Install CRDs into your cluster
make install

# Run the controller locally (outside cluster)
make run

# In another terminal, deploy a test model
kubectl apply -f config/samples/inference_v1alpha1_model.yaml
kubectl apply -f config/samples/inference_v1alpha1_inferenceservice.yaml
```

```bash
# Build and push Docker image
export IMG=<your-registry>/llmkube:dev
make docker-build docker-push IMG=$IMG

# Deploy to cluster
make deploy IMG=$IMG

# Check controller logs
kubectl logs -n llmkube-system deployment/llmkube-controller-manager -f
```

## Making Changes

Branch naming:

- `main` - Stable, production-ready code
- `feat/*` - New features
- `fix/*` - Bug fixes
- `docs/*` - Documentation changes
- `refactor/*` - Code refactoring
Example:

```bash
git checkout -b feat/prometheus-metrics
```

We follow Conventional Commits:

```
<type>(<scope>): <subject>

<body>

<footer>
```
Types:
- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation only
- `refactor`: Code refactoring (no functional change)
- `test`: Adding tests
- `chore`: Build process, tooling changes
Examples:
```
feat(controller): Add Prometheus metrics exporter

Adds token/sec, latency, and request count metrics to the
InferenceService controller. Metrics are exposed on :8080/metrics.

Closes #42
```

```
fix(model): Handle missing GGUF metadata gracefully

Previously crashed when GGUF files had incomplete metadata.
Now logs warning and continues with defaults.

Fixes #55
```
- Write tests for new functionality
- Update documentation (README, code comments)
- Run linters: `make lint`
- Run tests: `make test`
- Test manually in a real cluster
```bash
# Unit tests
make test

# E2E tests (requires cluster)
make test-e2e

# GPU E2E tests (requires GPU cluster)
./test/e2e/gpu_test.sh

# Lint
make lint

# Verify manifests are up to date
make manifests
git diff --exit-code  # Should be no changes
```

Writing Tests:
- Use table-driven tests for multiple cases
- Test both success and error paths
- Mock external dependencies (K8s API, HTTP downloads)
- Test CRD validation and defaulting
- Add GPU-specific tests for GPU features
- Verify metrics are properly exported
## Pull Request Process

Before opening a PR:

- Code passes all tests (`make test`)
- Linter passes (`make lint`)
- Manually tested in a Kubernetes cluster
- GPU tests pass (if GPU-related: `./test/e2e/gpu_test.sh`)
- Documentation updated (if adding features)
- Commit messages follow conventions
- Branch is up-to-date with `main`
- No hardcoded credentials or project IDs
- Push your branch to your fork
- Open a Pull Request against `main`
- Fill out the PR template completely
- Link related issues (e.g., "Closes #42")
- Request review from maintainers
Important: PR titles are used for automated changelog generation and version bumps via Release Please. Since we use squash merging, your PR title becomes the commit message on main.
```
<type>: <description>
```
| Type | Description | Version Bump |
|---|---|---|
| `feat:` | New feature | Minor (0.x.0) |
| `fix:` | Bug fix | Patch (0.0.x) |
| `docs:` | Documentation only | None |
| `ci:` | CI/CD changes | None |
| `chore:` | Maintenance tasks | None |
| `refactor:` | Code refactoring | None |
| `test:` | Adding tests | None |
Examples:
- `feat: Add Prometheus metrics to InferenceService controller`
- `fix: Resolve model download timeout on slow connections`
- `docs: Add troubleshooting guide for GPU issues`
- `feat(helm): Add image digest support for production deployments`
Scopes (optional): Use parentheses to specify the area of change: `feat(cli):`, `fix(controller):`, `docs(helm):`, etc.
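Since the PR title type drives the version bump, it helps to sanity-check it before opening the PR. Here is a toy Go sketch of the mapping in the table above; it is illustrative only, not the actual Release Please implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// bumpFor returns the semver bump applied for a given PR title type,
// following the feat/fix/other mapping described above.
func bumpFor(title string) string {
	// Split off the "<type>(<scope>):" prefix, e.g. "feat(helm): ..." -> "feat(helm)".
	prefix, _, ok := strings.Cut(title, ":")
	if !ok {
		return "none"
	}
	// Drop an optional scope: "feat(helm)" -> "feat".
	if i := strings.Index(prefix, "("); i >= 0 {
		prefix = prefix[:i]
	}
	switch strings.TrimSpace(prefix) {
	case "feat":
		return "minor"
	case "fix":
		return "patch"
	default: // docs, ci, chore, refactor, test
		return "none"
	}
}

func main() {
	fmt.Println(bumpFor("feat(helm): Add image digest support")) // minor
	fmt.Println(bumpFor("fix: Resolve model download timeout"))  // patch
	fmt.Println(bumpFor("docs: Add troubleshooting guide"))      // none
}
```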
- Automated checks run (tests, linting, builds)
- Maintainer review (usually within 2-3 days)
- Address feedback by pushing new commits
- Approval and merge (squash-merge to `main`)
What reviewers look for:
- Code quality and readability
- Test coverage
- Performance implications
- Backward compatibility
- Documentation completeness
## Coding Standards

- Follow Effective Go
- Use `gofmt` and `golangci-lint`
- Keep functions small and focused
- Add comments for exported types/functions
- Use descriptive variable names
Example:

```go
// ReconcileModel downloads and validates the model specified in the Model CRD.
// It updates the Model status with download progress and any errors encountered.
func (r *ModelReconciler) ReconcileModel(ctx context.Context, model *inferencev1alpha1.Model) error {
	// Implementation
}
```

For CRD and API changes:

- Follow Kubernetes API conventions
- Use `+kubebuilder` markers for validation
- Provide meaningful status conditions
- Add examples in `config/samples/`
- Code comments: Explain why, not what
- README updates: For user-facing changes
- Architecture docs: For design decisions
- Examples: Working YAML manifests
## Community

- Discord: Join the community for real-time chat, support, and contributor coordination
- GitHub Issues: Bug reports, feature requests
- GitHub Discussions: Q&A, ideas, general discussion
Community meetings:

- When: Mondays, 10am PT (details in GitHub Discussions)
- Agenda: Review PRs, discuss roadmap, unblock contributors
Contributors will be:
- Listed in release notes
- Added to CONTRIBUTORS file
- Thanked in community updates
Here's a complete workflow for adding a feature:

```bash
# 1. Create feature branch
git checkout -b feat/multi-gpu-single-node

# 2. Make changes
vim api/v1alpha1/model_types.go
vim internal/controller/inferenceservice_controller.go

# 3. Generate manifests
make manifests

# 4. Write tests
vim internal/controller/inferenceservice_controller_test.go

# 5. Run tests
make test

# 6. Test manually (CPU)
make install
make run
kubectl apply -f config/samples/inference_v1alpha1_model.yaml

# 7. Test with GPU (if GPU feature)
kubectl apply -f config/samples/gpu-model-example.yaml
./test/e2e/gpu_test.sh

# 8. Lint
make lint

# 9. Update documentation
vim README.md
vim docs/gpu-setup-guide.md  # If GPU-related

# 10. Commit
git add .
git commit -m "feat(gpu): Add multi-GPU single-node support

Enables layer offloading across multiple GPUs on single node.
Adds spec.hardware.gpu.count field to Model CRD.
Verified with 13B model on 2x L4 GPUs.

Closes #123"

# 11. Push and create PR
git push origin feat/multi-gpu-single-node
# Open PR on GitHub
```

If you're stuck or have questions:
- Check existing documentation (README, ROADMAP, code comments)
- Search closed issues and PRs
- Ask in GitHub Discussions
- Ping maintainers in your PR
By contributing to LLMKube, you agree that your contributions will be licensed under the Apache License 2.0.
Thank you for contributing to LLMKube! Every PR, issue report, and doc improvement helps make local LLM deployment accessible to everyone.