feat: remove ollama pull feature and add mistralrs local embedding support #851
Closed
starpit wants to merge 1 commit into
Force-pushed from 583baf8 to 1667d1e
…pport

BREAKING CHANGE: Removed automatic ollama model pulling functionality

This commit removes ollama installation and model pulling from CI/CD workflows and adds support for local embedding models via mistral.rs backend.

Docker & Infrastructure Cleanup:
- Deleted docker/Containerfile.hostbuild (14 lines removed)
- Deleted docker/Containerfile.hostbuild.ollama (30 lines removed)
- Removed ollama installation from docker/gce/vllm/setup.sh
- Removed ollama installation from docker/gce/vllm/setup-dev.sh
- Removed ollama installation from docker/gce/vllm/create-vllm-gce-image.sh
- Removed ollama health check from docker/gce/vllm/test.d/spnl-speedup.sh

CI/CD Workflow Changes (.github/workflows/core.yml):
- Removed ollama installation step (Linux curl install, macOS brew install)
- Removed ollama systemd service startup logic
- Simplified to single disk space cleanup step
- Reduced workflow complexity by 12 lines

Embedding Backend Implementation:
- Added spnl/src/generate/backend/mistralrs/embed.rs (37 lines)
  * New embed() function using mistralrs EmbeddingModelBuilder
  * Supports local embedding models with device detection
  * Converts EmbedData to text strings and generates embeddings
- Updated spnl/src/generate/backend/mistralrs/mod.rs to export embed module

Embedding System Refactoring (spnl/src/augment/embed.rs):
- Moved contentify() helper from openai.rs to embed.rs for shared use
- Added support for local/ prefix to use mistralrs backend
- Refactored embed() to return Vec<Vec<f32>> consistently across backends
- Changed all backend calls to collect() results into Vec for uniform handling
- Improved error handling with proper Result propagation

OpenAI Backend Cleanup (spnl/src/generate/backend/openai.rs):
- Removed duplicate contentify() function (18 lines)
- Updated embed() to use shared contentify() from augment::embed module
- Simplified code by removing redundant helper function

CLI Default Model Changes (cli/src/args.rs):
- Changed default generative model from 'ollama/granite3.3:2b' to 'llama3.2:1b'
- Changed default embedding model from 'ollama/mxbai-embed-large:335m' to 'local/google/embeddinggemma-300m'
- Updated to use local embedding models by default

Cargo Configuration (spnl/Cargo.toml):
- Added 'local' feature to default features list
- Enables mistralrs local embedding support by default
- Removed unused tokio-util dependency (not referenced in any features or code)

Index & RAG Updates:
- Updated spnl/src/augment/index/mod.rs to use new embed signature
- Removed pull_if_needed call from spnl/src/augment/index/raptor.rs

Build System:
- Updated .github/scripts/free-up-disk-space-fast.sh with additional cleanup

Net change: -124 lines, +88 lines (36 lines removed overall)

Users will need to manually pull ollama models before use if using ollama backend.

Signed-off-by: Nick Mitchell <nickm@us.ibm.com>
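For readers without the diff at hand, here is a minimal sketch of what the `local/` prefix dispatch and the uniform `Vec<Vec<f32>>` return described in the commit message might look like. It is not the code from this PR: `EmbedBackend`, `select_backend`, and `stub_embed` are hypothetical names, the error type is simplified to `String`, and the stub returns a placeholder vector instead of calling mistralrs.

```rust
/// Hypothetical routing target for an embedding request (illustration only).
#[derive(Debug, PartialEq)]
enum EmbedBackend<'a> {
    /// `local/<model>`: would be served in-process by the mistralrs backend.
    Local(&'a str),
    /// Any other model name: would go to one of the remaining backends.
    Other(&'a str),
}

/// Pick a backend from the model name by checking for the `local/` prefix.
fn select_backend(model: &str) -> EmbedBackend<'_> {
    match model.strip_prefix("local/") {
        Some(name) => EmbedBackend::Local(name),
        None => EmbedBackend::Other(model),
    }
}

/// Placeholder embedder (assumption): yields a one-element vector holding the
/// input length, or an error for empty input. A real backend would run a model.
fn stub_embed<'a>(
    inputs: &'a [&'a str],
) -> impl Iterator<Item = Result<Vec<f32>, String>> + 'a {
    inputs.iter().map(|text| {
        if text.is_empty() {
            Err("cannot embed empty input".to_string())
        } else {
            Ok(vec![text.len() as f32])
        }
    })
}

/// Every branch returns the same `Result<Vec<Vec<f32>>, _>` shape.
/// `collect()` turns an iterator of `Result<Vec<f32>, _>` into a single
/// `Result`, stopping at the first error.
fn embed(model: &str, inputs: &[&str]) -> Result<Vec<Vec<f32>>, String> {
    match select_backend(model) {
        EmbedBackend::Local(_name) => stub_embed(inputs).collect(),
        EmbedBackend::Other(name) => {
            Err(format!("backend for '{name}' is not part of this sketch"))
        }
    }
}
```

Calling `embed("local/google/embeddinggemma-300m", &["ab", "cde"])` in this sketch takes the `Local` branch and returns one vector per input. Collecting into a `Result` is one way to get the "proper Result propagation" the commit mentions without per-backend error handling at the call site.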
Force-pushed from 1667d1e to 75e6414
BREAKING CHANGE: Removed automatic ollama model pulling functionality
This PR removes ollama installation and model pulling from the CI/CD workflows and adds support for local embedding models via the mistral.rs backend.
Changes
Docker & Infrastructure Cleanup
CI/CD Workflow Changes
Embedding Backend Implementation
Embedding System Refactoring
CLI Default Model Changes
Cargo Configuration
Test Updates
Documentation
Net Impact
Users of the ollama backend will need to pull ollama models manually before use.
Made with Bob